
Linking Student Achievement Growth to Professional Development Participation and Changes in Instruction: A Longitudinal Study of Elementary Students and Teachers in Title I Schools

by Laura M. Desimone, Thomas M. Smith & Kristie J. R. Phillips (2013)

Background/Context: Most reforms in elementary education rely on teacher learning and improved instruction to increase student learning. This study increases our understanding of which types of professional development effectively change teaching practice in ways that boost student achievement.

Purpose/Objective/Research Question/Focus of Study: Our three-year longitudinal analysis answers two main research questions: (1) To what extent do teachers’ topic coverage, emphasis on memorization and solving novel problems, and time spent on mathematics instruction predict student mathematics achievement growth? (2) To what extent does teacher participation in content-focused professional development predict the aspects of instruction found in our first analysis to be related to increases in student mathematics achievement growth?

Population/Participants/Subjects: This study uses data collected by the U.S. Department of Education for the Longitudinal Evaluation of School Change and Performance (LESCP) in 1997, 1998, and 1999. The LESCP drew its sample from 71 high-poverty schools in 18 school districts in 7 states. Our student-level analyses include 7,588 observations over three years of 4,803 students assigned to 457 teachers. Teacher-level analyses include the same 457 teachers in 71 schools over three years.

Research Design: This is a quasi-experimental longitudinal study. To answer our first research question, we employ a four-level cross-classified growth model using MLwiN software, with time points nested within students, students cross-classified by teachers over the three years of the study, and teachers and students nested within schools.
To answer our second question, we employ a series of hierarchical linear models (HLM) to test the relationship between instruction and professional development.
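The growth-modeling approach can be illustrated with a simplified two-level version (time points nested within students), fit here with Python’s statsmodels rather than MLwiN. The data, variable names, and effect sizes below are simulated for illustration only, and the full four-level cross-classified structure of the published analysis is not reproduced:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_students, n_waves = 200, 3

# One row per student-by-year observation.
df = pd.DataFrame({
    "student": np.repeat(np.arange(n_students), n_waves),
    "year": np.tile(np.arange(n_waves), n_students),
})
# Hypothetical teacher-level predictor: coverage of advanced topics.
df["advanced"] = rng.normal(size=len(df))

# Simulate scores: baseline 50, student random intercepts (sd 5),
# average growth of 3 points/year, and faster growth (+0.8/year per
# unit of coverage) when advanced-topic coverage is higher.
intercepts = rng.normal(0, 5, n_students)[df["student"]]
df["score"] = (50 + intercepts + 3 * df["year"]
               + 0.8 * df["advanced"] * df["year"]
               + rng.normal(0, 2, len(df)))

# Random intercept and slope for year by student; the year:advanced
# interaction asks whether coverage predicts the *rate* of growth.
model = smf.mixedlm("score ~ year * advanced", df,
                    groups="student", re_formula="~year").fit()
print(model.summary())
```

In this toy setup, the coefficient on the year-by-coverage interaction recovers the simulated difference in growth rates, which is the same kind of parameter the study’s growth models estimate.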
Conclusions/Recommendations: We found that (1) when teachers in third, fourth, and fifth grade focused more on advanced mathematics topics (defined as operations with fractions, distance problems, solving equations with one unknown, solving two equations with two unknowns, and statistics) and emphasized solving novel problems, student achievement grew more quickly; (2) when teachers focused more on basic topics (defined as measurement, rounding, multi-digit multiplication, and problem solving) and emphasized memorizing facts, student achievement grew more slowly; and (3) when teachers participated in professional development that focused on math content or instructional strategies in mathematics (in Year 1), they were more likely to teach in ways associated with student achievement growth. Specifically, they were more likely to teach advanced topics and emphasize solving novel problems. Effect sizes ranged from 1% to 15% of a standard deviation.
INTRODUCTION

This study examines relationships between teachers’ participation in professional development and changes in instruction, and between instruction and student achievement growth, from third to fifth grade. Using the Longitudinal Evaluation of School Change and Performance (LESCP), a national study of 71 Title I schools, we first examine to what extent specific aspects of the content of elementary teachers’ self-reported mathematics instruction predict individual student achievement growth from third to fifth grade. Then we examine to what extent participation in mathematics-related professional development predicts whether teachers will emphasize the dimensions of content that we used to predict student achievement growth. We focus on elementary mathematics for two reasons: because the nation’s education policy agenda places great importance on early mathematics (e.g., U.S. Department of Education, 2008), and to reduce the confounding factors of grade and subject. Most reforms in elementary education rely on teacher learning and improved instruction to increase student learning. Research has increasingly shown that continuing development and learning by teachers is critical to improving student learning (e.g., Borko & Putnam, 1995; Carnegie Forum on Education and the Economy, 1986; National Commission on Teaching and America’s Future, 1997). Thus we need to better understand which types of professional development effectively change teaching practice in ways that boost student achievement.
STUDIES LINKING PROFESSIONAL DEVELOPMENT, TEACHING, AND STUDENT ACHIEVEMENT

Prior work suggests how links between professional development and instruction might operate (e.g., O’Sullivan & McGonigle, 2010), helps establish the relationship between professional learning and changes in instructional practice (Desimone, Porter, Garet, Yoon, & Birman, 2002; Parise & Spillane, 2010), and explores how professional learning affects teacher knowledge (Carlisle, Correnti, Phelps, & Zeng, 2009; Goldschmidt & Phelps, 2010; Kanter & Konstantopoulos, 2010). However, only a few studies have attempted to empirically link professional development to student achievement. A brief review of such studies demonstrates that their findings are mixed; moreover, few are longitudinal or include instruction as a mediating variable.

LINKS BETWEEN PROFESSIONAL DEVELOPMENT AND STUDENT ACHIEVEMENT

Using a quasi-experimental design, Jacob and Lefgren (2004) found that a marginal increase in teacher in-service training had no statistically or academically significant effect on either reading or math achievement for third- through sixth-graders in Chicago public schools. The authors cautioned that interpretations of their findings should consider that they had no measure of the content or quality of professional development, and that the training was not well aligned with district or school priorities. Bressoux (1996), using a quasi-experimental design in a study of French elementary schools, found that teacher training increased third-graders’ mathematics achievement by about one-fifth of a standard deviation. Dildy (1982), in a very small, randomized trial of 16 teachers, found that professional development improved student achievement.
Cobb, Wood, Yackel, Nicholls, Wheatley, Trigatti, and Perlwitz (1991) found that the students of 10 teachers who participated in math-related professional development had higher levels of conceptual understanding in math than did students of 8 control teachers, but they found no differences in computational ability. Similarly, in a study of 23 teachers, Saxe, Gearhart, and Nasir (2001) tested two contrasting professional development programs designed to improve implementation of a fractions unit: one enhanced teachers’ knowledge of subject matter and children’s knowledge of math, and the other gave teachers opportunities for collegial interaction; a third group was the control group. They found that students whose teachers participated in the content-focused professional development had higher posttest scores on conceptual measures, while the control group’s students had higher scores on computation than did students of the group that received collegial support. Angrist and Lavy (2001), in a matched comparison of Jerusalem public elementary schools, found that a 6- to 12-hour teacher training program produced gains of about half of a standard deviation in reading and mathematics scores.

LINKS AMONG PROFESSIONAL DEVELOPMENT, INSTRUCTION, AND STUDENT ACHIEVEMENT

Several studies of professional development effects have included instruction as a mediating variable. In a landmark study, Carpenter, Fennema, Peterson, Chiang, and Loef (1989) applied a professional development intervention that emphasized how students learn math to a randomly assigned group of 20 first-grade teachers. The study measured several teaching and achievement outcomes, and found significant effects for some of them. Specifically, the authors found that teachers used more word problems and that students better recalled facts and performed better on complex addition and subtraction. They did not, however, find any effects on standardized test scores.
Effect sizes ranged from about one-half to 1 standard deviation. To put this finding in context, most studies that link implementation of reforms to student achievement have had much smaller effect sizes: for example, in mathematics, Balfanz, Mac Iver, and Byrnes (2006) found an effect size of 0.20 of a standard deviation, and Hamilton et al. (2003) found an effect size of less than 0.10 of a standard deviation; in science, Heck, Banilower, Weiss, and Rosenberg (2008) found effect sizes ranging from 0.11 to 0.38 of a standard deviation. Wiley and Yoon (1995) studied fourth-, eighth-, and 10th-grade California mathematics teachers in a 1-year study of natural variation, which relied on correlations and single-item indicators of teaching practice. They found links among teachers’ exposure to reform-oriented practices, implementation of those practices, and student achievement, though results were strongest for fourth grade. Using aggregate school-level measures of achievement, Cohen and Hill (2001) found that teacher professional development programs with sufficient duration and subject matter focus were related to small increases in reform-oriented practice (about half a standard deviation) and school-aggregated student performance (fourth-grade scores were about one-sixth of a standard deviation higher for students whose teachers used reform practices than for students whose teachers did not use such practices). Most recently, Glazerman et al. (2008) and Garet et al. (2008) used randomized controlled trials to examine how teachers’ learning affects their knowledge, their instruction, and their students’ achievement. These studies found no effects on student achievement and few effects on teachers, though Garet et al. did find that content focus and duration of professional development had significant effects on teacher knowledge and practice.
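The effect sizes quoted throughout this review are standardized mean differences, that is, a gain expressed as a fraction of a standard deviation. A minimal sketch of that computation, on simulated data:

```python
import numpy as np

def effect_size_sd_units(treatment, control):
    """Standardized mean difference (Cohen's d with pooled SD)."""
    t, c = np.asarray(treatment, float), np.asarray(control, float)
    pooled_var = (((t.size - 1) * t.var(ddof=1)
                   + (c.size - 1) * c.var(ddof=1))
                  / (t.size + c.size - 2))
    return (t.mean() - c.mean()) / np.sqrt(pooled_var)

# Simulated scores on a test scaled to SD = 1: the "treated" group
# is drawn with a true gain of 0.20 standard deviations.
rng = np.random.default_rng(1)
treated = rng.normal(0.20, 1.0, 5000)
comparison = rng.normal(0.00, 1.0, 5000)
print(round(effect_size_sd_units(treated, comparison), 2))
```

An effect of 0.20, as in Balfanz, Mac Iver, and Byrnes (2006), thus means the group means differ by one-fifth of the pooled spread of scores, regardless of the test’s raw scale.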
Though each of these studies contributes to our understanding of how professional development works to improve teaching and learning, each has its limitations. For example, Wiley and Yoon (1995) relied on cross-sectional correlational data, Carpenter et al. (1989) was conducted on a small sample of teachers, Cohen and Hill (2001) used school-aggregated rather than individual achievement, and Glazerman et al. (2008) and Garet et al. (2008) tested a very specific professional development intervention. (Garet et al., for example, tested the effectiveness of professional development on rational numbers.)

RESEARCH QUESTIONS AND CONTRIBUTIONS OF OUR STUDY

Our analysis contributes to this line of work by answering two main research questions: (1) To what extent do teachers’ topic coverage, emphasis on memorization and solving novel problems, and time spent on mathematics instruction predict student mathematics achievement growth? and (2) To what extent does teacher participation in content-focused professional development predict the aspects of instruction found in our first analysis to be related to increases in student mathematics achievement growth? Compared to previous studies, our work has unique analytic and design features. First, we look longitudinally at teachers’ instructional practices and participation in professional development over three years and measure students’ achievement growth over those three years associated with their teachers’ instructional practices. Second, we examine change in teaching practices as a mediator of professional development’s effects. Third, we measure conceptual emphasis, operationalized as solving novel problems, and procedural emphasis, operationalized as memorization, as well as coverage of basic and advanced topics (e.g., calibrated to typical grade-level topic coverage, long division is a basic topic for a fifth-grader and multi-step equations are advanced).
This allows us to compare how different topic and cognitive demand emphases affect achievement growth. Fourth, we examine how content focus, an important feature of quality professional development, is related to teachers’ topic and cognitive demand coverage. Fifth, we conduct the analysis on a substantial sample of teachers and students. By combining these design elements, we can answer our research questions using multilevel longitudinal growth modeling, uncommon in studies linking professional development, instruction, and student achievement. Further, our study is conducted with a sample of teachers and students in Title I schools, which spend significant amounts of money on interventions to help boost achievement for large numbers of struggling students; these interventions often have teachers’ professional development at their core. The professional development we are studying resulted from a major policy initiative to improve teaching and learning in high-poverty schools, thus raising a policy question: Can such a policy change practice and improve student learning? We study professional development available to “typical” teachers, rather than a targeted program available to only a few teachers. The work responds to recent calls for larger-scale empirical evidence of links among professional development, instruction, and student achievement (Geijsel, Sleegers, Stoel, & Kruger, 2009; Gersten, Dimino, Jayanthi, Kim, & Santoro, 2010; Goldschmidt & Phelps, 2008; Parise & Spillane, 2010; U.S. Department of Education, 2008). We test two theories described in Wayne, Yoon, Zhu, Cronen, and Garet (2008).
First, we test a theory of teacher change, which hypothesizes that content-focused professional development fosters an increase in teachers’ coverage of advanced topics and conceptual cognitive demands; second, we test a theory of instruction: specifically, that advanced topic coverage and emphasis on solving novel problems are associated with a faster rate of achievement growth, and, conversely, that basic topic coverage and an emphasis on memorization are associated with a slower rate of achievement growth.

CONCEPTUAL/THEORETICAL GROUNDING

Our study is grounded in a conceptual/theoretical framework that reflects how professional development might affect student achievement, shown in Figure 1. We test only a small part of this framework, but the relationships we test are part of the broader conception of how professional development impacts teacher knowledge and practice and, in turn, student learning. A conceptual framework for studying professional development has at least two central components. One is recognizing a set of critical features that define effective professional development. The second is establishing an operational theory of how professional development works to influence teacher and student outcomes.

Figure 1. Proposed Core Conceptual Framework for Studying the Effects of Professional Development on Teachers and Students

Our framework identifies key inputs as well as intermediate and final outcomes. It also identifies the variables that mediate (explain) and moderate (interact to influence) the effects of professional development. The model represents relationships among critical features of professional development, teacher knowledge and beliefs, classroom practice, and student outcomes.
As Figure 1 shows, a core theory of action for professional development would likely follow these steps: (1) teachers experience professional development with effective features; (2) the professional development increases teachers’ knowledge and skills and/or changes their attitudes and beliefs; (3) teachers use their new knowledge and skills, attitudes, and beliefs to improve the content of their instruction, their pedagogy, or both; and (4) these instructional changes foster increased student learning^{1} (see Desimone, 2009). The model reflects the nonrecursive and interactional nature of the relationships among elements in our conceptual framework (though here we test only unidirectional relationships). The literature reflects the importance of each element in our framework: links between teacher knowledge, practice, and student achievement (Hill, Ball, & Schilling, 2008; Phelps & Schilling, 2004); between instruction and student achievement (e.g., Desimone & Long, 2010; Hamilton et al., 2003; von Secker, 2002); between professional development and teacher practice (Desimone et al., 2002; Fishman, Marx, Best, & Tal, 2003; Heck et al., 2008); between professional development and student achievement (e.g., Angrist & Lavy, 2001; Bressoux, 1996; Cohen & Hill, 2001; Jacob & Lefgren, 2004); and between “opportunity to learn,” or time spent on instruction, and student achievement (e.g., Boscardin et al., 2005; Gamoran, Porter, Smithson, & White, 1997).^{2} We readily acknowledge that context affects these relationships. We are examining average effects in high-poverty Title I schools. Further, we are not evaluating the effectiveness of a particular professional development program. Previous work has found that specific aspects of a curriculum or program can make certain features of professional development more or less effective (Penuel, Fishman, Yamaguchi, & Gallagher, 2007); we do not investigate this potentially confounding factor.
Rather, we are estimating the influence of sustained, math-focused professional development to suggest whether, on average, investing in such professional development is likely to influence teaching and learning. Consistent with most social science research, we test some but not all of the complex theoretical model or framework that describes the phenomena we study. Thus, while we hypothesize nonrecursive relationships, our analysis focuses on understanding the unidirectional relationships among professional development, teaching, and student achievement. Empirical studies that include all the elements in our framework are rare, though a handful of studies have addressed the links in all four areas: professional development, content knowledge, instruction, and student achievement (Carpenter et al., 1989; Cobb et al., 1991; Franke, Carpenter, & Levi, 2001; Saxe et al., 2001). Our study focuses on only a portion of this framework: the link between the content focus of professional development and change in the content of instruction, and the link between the content of instruction and student achievement growth in elementary mathematics. Below we explain our focus on memorization and solving novel problems, within the domains of conceptual and procedural mathematics, and on time spent in math-focused professional development.

CONCEPTUAL EMPHASIS IN MATHEMATICS TEACHING AND STUDENT ACHIEVEMENT

We study specific aspects of conceptual and procedural emphasis in mathematics instruction. Conceptual mathematics instruction, sometimes called “reform-oriented” instruction, seeks to foster a deep understanding of fundamental mathematics principles, ideas, and connections. It often includes emphasis on real-world problem solving, student reflection and discussion, application of ideas to novel problems, and the use of inquiry-oriented investigation techniques (Hiebert et al., 1996).
Procedural or computational mathematics teaching, in contrast, emphasizes drill, memorization, and the performance of routine procedures. This dichotomy is well established in the math education literature (e.g., Cobb et al., 1991; Saxe et al., 2001), and the “conceptual” and “procedural” domains are commonly used in studying mathematics teaching and learning (e.g., Carpenter, Franke, Jacobs, Fennema, & Empson, 1998; Hiebert & Wearne, 1996). Studies have demonstrated benefits from both conceptual instruction (e.g., Desimone & Long, 2010; Fennema, Carpenter, Franke, Levi, Jacobs, & Empson, 1996; Loveless, 2001; Romberg, 2000; Stein & Lane, 1996) and procedural instruction (Geary, 2001; Hirsch, 2001; Slavin, Madden, Karweit, Livermon, & Dolan, 1990). These competing literatures were thoroughly reviewed most recently by the National Mathematics Advisory Panel (U.S. Department of Education, 2008), which concluded that the field should recognize “the mutually reinforcing benefits of conceptual understanding, procedural fluency, and automatic (i.e., quick and effortless) recall of facts” (p. xiv). Our purpose is not to contribute to debates about when and how to use procedural or conceptual mathematics instruction. Instead, our study is based on the premise that increased use of conceptual instruction holds promise for reducing the achievement gap. Typical mathematics instruction in the US is procedural (Desimone, Smith, Baker, & Ueno, 2005; Schmidt, McKnight, & Raizen, 1997), and lower-achieving students are much more likely to receive predominantly computational/procedural teaching compared to their average and higher-achieving counterparts (e.g., Barr & Dreeben, 1983; Desimone, Smith, & Frisvold, 2007; Gamoran, 1986).
There is substantial evidence that conceptual instruction boosts learning for all students (e.g., Schoenfeld, 1985; Silver, 1985; Stein & Lane, 1996; von Secker, 2002) and that it may be especially advantageous for low-achieving students (Carpenter et al., 1989; Hiebert, 1999; Knapp, 1995; Wenglinsky, 2004). If a balance of instruction is desirable, as the Mathematics Panel concluded, and procedural instruction is the most common form, especially for low-achieving students, it seems appropriate to find ways to foster increased use of conceptual instruction, especially among teachers of low-achieving students.

MEMORIZATION VS. SOLVING NOVEL PROBLEMS

To operationalize the ends of the procedural/conceptual continuum, we contrast memorization and solving novel problems. This is consistent with work in the teaching and learning of mathematics, which grounds one or a few targeted measures within the broader domain of conceptual and procedural instruction (e.g., Rittle-Johnson & Alibali, 1999; Rittle-Johnson & Star, 2007). For memorization, skill and knowledge mastery is the goal, whereas for solving novel problems, the ability to understand, apply, and transfer is the goal (see Ernest, 1989). In mathematics teaching and learning, “solving novel problems” is commonly understood as asking students to apply previous knowledge to a different situation, and/or to devise or evaluate unfamiliar procedures (Rittle-Johnson & Alibali, 1999). Novel transfer problems are problems that can be solved by modifying learned solution methods to fit new problem features (Paas & van Merrienboer, 1994; Singley & Anderson, 1989). If children develop flexibility in solving problems and a deep understanding of underlying fundamental concepts, they can apply their knowledge and adjust the procedures to conquer unfamiliar problems (Briars & Siegler, 1984; Cauley, 1988; Cowan & Renton, 1996; Rittle-Johnson & Alibali, 1999; Rittle-Johnson & Star, 2007; Siegler, 1996).
This type of transfer to novel math problems is an important component of mathematical competence (Star & Seifert, 2006), well established as a desirable type of conceptual understanding (e.g., Perry, 1991). In contrast, when children focus on memorization and procedures, they often do not master concepts in a way that allows them to transfer their knowledge to novel problems (Fuson, 1990; Hiebert & Wearne, 1996; Kouba, Carpenter, & Swafford, 1989). As a result, many scholars and educators argue that learning should go beyond rote memorization to foster students’ conceptual understanding and ability to transfer knowledge to novel situations (Kazemi & Stipek, 2001; Li, 2008). Nonetheless, memorizing math facts and formulas has been a fundamental and traditional part of U.S. mathematics education for decades (Stigler & Hiebert, 1999), and research continues to document that the predominant mode of instruction in the US is memorization-based and procedural (e.g., Rowan, Harrison, & Hayes, 2004). Though memorization unquestionably plays a critical role in math learning (U.S. Department of Education, 2008; Kilpatrick et al., 2001), it is less clear when and how much to use it (Loveless, 2001); it is particularly unclear how to translate memorized rote knowledge into understanding (Cai & Wang, 2010). An overreliance on memorization can be detrimental to fostering the conceptual understanding necessary to solve novel problems. When learning is based only on memorization, students have trouble solving novel problems that require different steps (Catrambone, 1994); even so, learners rely heavily on previously memorized examples when solving new problems (LeFevre & Dixon, 1986). The challenge for teachers is to identify what must be memorized in order to solve problems in a certain domain (Catrambone, 1994), and what must be conceptually understood to help students make connections and apply knowledge to new types of problems (Benson & Malm, 2001).
Certainly, it is not the norm in mathematics teaching to foster multiple ways of solving problems that facilitate knowledge transfer to new situations (Ball, 1993; Lampert, 1990). Further, though memorization can lead to increased scores on standardized tests, it may result in fragmented understandings of math, and little understanding of connections among mathematical concepts, which impedes learning advanced concepts (Schoenfeld, 1988).

IDENTIFYING FEATURES OF EFFECTIVE PROFESSIONAL DEVELOPMENT: AN EMPHASIS ON CONTENT

Recent research reflects a consensus on some of the characteristics of professional development that increase teacher knowledge and skills, improve their practice, and, to a lesser extent, influence student achievement (Desimone, 2009). These characteristics include whether the professional development activity (1) is focused on subject-matter content or how students learn content (e.g., Cohen & Hill, 2001; Desimone et al., 2002; Garet et al., 2001; Kennedy, 1998); (2) provides opportunities for teachers to engage in active learning (Garet et al., 2001; Loucks-Horsley et al., 1998), for example, observation, interactive feedback, and analyzing student work (Banilower & Shimkus, 2004; Borko, 2004); (3) is consistent with teachers’ knowledge and beliefs (Consortium for Policy Research in Education, 1998; Elmore & Burney, 1997), and with school, district, and state reforms and policies (Elmore & Burney, 1997; Firestone, Mangin, Martinez, & Polovsky, 2005; Penuel et al., 2007); (4) is of sufficient duration, including both the span of time over which the activity is spread (e.g., one day or one semester) and the number of hours spent on the activity (Cohen & Hill, 2001; Supovitz & Turner, 2000); and (5) includes collaboration among teachers from the same school, grade, or department (Banilower & Shimkus, 2004; Borko, 2004; Little, 1993; Rosenholtz, 1989).
The content focus of teacher learning may be the most influential feature, and therefore the one we focus on in this study. In the past decade, a growing body of evidence has suggested that professional development that emphasizes subject matter content and how students learn that content (1) increases teachers’ knowledge and skills, and (2) improves instruction in ways likely to result in increased student learning. This evidence comes from case study data (e.g., Wilson & Ball, 1991); correlational analyses conducted with nationally representative teacher data (e.g., Garet et al., 2001); quasi-experiments (Banilower, Heck, & Weiss, 2005); longitudinal studies of teachers (e.g., Cohen & Hill, 2002; Desimone et al., 2002); meta-analyses (e.g., Kennedy, 1998); and experimental designs (e.g., Carpenter et al., 1989). The main hypothesis supporting a relationship between content-focused professional development and teacher learning and change is that to teach conceptually, teachers must first build their own knowledge of a subject and of how students learn that subject. The idea is that teachers must have a deep grasp of content so that they understand common mistakes and misunderstandings in student thinking, and alternative ways of solving a problem (Hill et al., 2008; Ma, 1999), both of which are necessary for more challenging, conceptual teaching. In our analysis, we concentrate on the time spent in mathematics-focused professional development.

STUDIES USING THE LESCP

Our study builds on work by Raudenbush, Hong, and Rowan (2003), who used the LESCP to examine the relationship between “high-intensity” mathematics instruction and student achievement growth, in the context of a paper testing causal modeling methods.
To create a measure of the intensity of instruction, they rated topics by difficulty and created an index reflecting the difference between more difficult and less difficult topics taught by a teacher in a specific year; their index guided our categorization of topics. They found that high-intensity instruction had some effects on achievement, from 0.15 to 0.23 of a standard deviation, but results were not consistent across grades. Our study builds on theirs by examining the relationship of professional development to teaching practice, by including teachers’ cognitive demand emphasis in our measure of instruction, and by examining change over three years. Wong, Meyer, and Shen (2003) also conducted an analysis of the LESCP, though their findings were reported at the school level; thus, student outcomes were not linked to specific teachers. They did, however, find evidence suggesting positive relationships among professional development, reform-oriented teaching, and student achievement.

METHODS

DATA

This study uses data collected for the LESCP in 1997, 1998, and 1999 by what was at the time the U.S. Department of Education’s Planning and Evaluation Service. The LESCP drew its sample from 71 Title I schools in 18 school districts in 7 states. Because the LESCP was part of an evaluation of Title I, the students in the sample largely came from low-income families with diverse ethnic backgrounds (Westat & Policy Studies Associates, 2001a). The schools in the study were drawn from a select sample of high-poverty schools, which is neither representative nor random. The schools were chosen because they were among the earliest to implement standards-based reform. As such, they offer a useful set of data for considering how to advance standards-based reform on a national level, especially in high-poverty schools. The original sample consisted of all teachers in the 71 schools, with an average of 20 teachers per school.
Our sample includes all students in each year who were linked to valid math achievement test scores, and all teachers who specified that they taught mathematics in each year and also reported teaching grades three to five.^{3} Student achievement tests and teacher surveys were administered in the spring of each year. Our student-level analyses include 7,588 observations over three years of 4,803 students assigned to 457 teachers. Teacher-level analyses include the same 457 teachers in 71 schools over three years. Listwise deletion was used in all analyses for cases with missing data.^{4} From the full teacher sample, we identified our analytic sample by limiting it to teachers who taught third- to fifth-grade mathematics, and who had students who took the open-ended mathematics test (explained in more detail in the Measures section below). Further, we used only students who were part of the third-grade cohort. The LESCP added students who were not part of the original cohort; we did not use these students because we wanted a longitudinal sample of students to measure growth. Also, several schools and teachers had missing data on key variables, which we were unable to accurately impute because most measures were missing for the teacher or school; for example, if a teacher’s information was missing on one variable, it was likely missing on all variables. This is partly because at some schools, no teachers completed the survey; this phenomenon was concentrated in a few districts. From the overall sample of 1,829 teachers, then, 1,666 were math teachers, 897 of whom were teachers of grades three through five, of whom 726 had students who took the open-ended mathematics test. Six hundred thirty-four of the 726 taught the third-grade cohort that took the open-ended mathematics test; of those 634, 457 did not have missing school or teacher data.
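The successive sample restrictions described above can be expressed as a chain of data filters. This pandas sketch uses invented column names and a handful of toy rows purely to illustrate the funnel from the full teacher file down to the analytic sample:

```python
import pandas as pd

# Toy teacher file; each column mirrors one of the filters described
# in the text (column names are hypothetical, not from the LESCP).
teachers = pd.DataFrame({
    "teacher_id": range(6),
    "teaches_math": [True, True, True, True, True, False],
    "grade": [3, 4, 5, 2, 4, 3],
    "students_took_open_ended": [True, True, False, True, True, True],
    "in_third_grade_cohort": [True, False, True, True, True, True],
    "complete_data": [True, True, True, True, False, True],
})

# Apply each restriction in sequence, as in the text:
# math teacher -> grades 3-5 -> open-ended test -> cohort -> complete data.
analytic = (teachers
            .loc[lambda d: d["teaches_math"]]
            .loc[lambda d: d["grade"].between(3, 5)]
            .loc[lambda d: d["students_took_open_ended"]]
            .loc[lambda d: d["in_third_grade_cohort"]]
            .loc[lambda d: d["complete_data"]])
print(len(analytic))
```

In the toy data only one teacher survives every filter; on the actual file the same chain would reproduce the 1,829 to 1,666 to 897 to 726 to 634 to 457 funnel reported in the text.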
Missing data on all dependent variables (achievement as well as the cognitive demand variables in the teacher models) were left as missing. Missing data on all other variables, which were primarily at the student level, were imputed using regression imputation in order to maintain realistic variance. After cases were deleted because of missing data on the dependent variables, little missing data remained. Of the 4,803 students in our cohort sample, 2,655 had one observation, 1,511 had two observations, and 637 had three observations. Turnover is common in Title I schools, so it is not surprising that some students did not have three full years of observations. However, because the LESCP refreshed the sample to deal with the severe attrition, the data lose only about 340 observations per year. We chose to keep the refreshed students to maintain reasonable degrees of freedom to estimate our models. A comparison of our analytic sample with the full LESCP sample showed general consistency across both samples for all three years in terms of student race/ethnicity, income, individualized education program (IEP) status, limited English proficiency (LEP) status, and student achievement status.

Differential Attrition

We tested for differential attrition by comparing the number of students in the study for each year by race/ethnicity, free and reduced-price lunch, IEP status, and LEP status (not shown). There were small fluctuations in the sample, but the changes were consistently quite small. We also examined the percentage change in students with average or above-average achievement in the sample, and found a 5% increase over the three years.
MEASURES

Our analyses measure two relationships hypothesized to affect student achievement. First, we measure the relationship between (1) growth in student math achievement and (2) teachers' topic and cognitive demand emphasis and time spent on mathematics. Second, we measure the relationship between (1) teachers' topic and cognitive demand emphasis and time spent on mathematics and (2) teachers' participation in professional development over three years. We control for student, teacher, and school characteristics that are likely to be related to student and teacher outcomes. Table 1 describes each variable used in the analyses and outlines how the variables were created and coded. The table also lists the means and standard deviations (where applicable) for each variable for each year of the study, as well as for all years combined. These means help identify trends over time between the dependent and independent variables.

Table 1. Bivariate Correlations for Student Achievement Models
Student Mathematics Achievement

The LESCP administered the Stanford Achievement Test, Ninth Edition (SAT-9), to all participating students (most third- to fifth-graders in the schools selected for evaluation). The SAT-9 is a norm-referenced achievement test with two sections for mathematics. Separate scores were obtained for closed- and open-ended mathematics items. All 18 districts in the LESCP study participated in the open-ended mathematics test; however, not every district required its students to take the closed-ended portion (Westat & Policy Studies Associates, 2001a, 2001b). Because the sample of students who took the closed-ended test is significantly smaller than the sample who took the open-ended test, we use the open-ended mathematics test scores in our student-level analyses. The closed-ended and open-ended mathematics items test the same skill set: both assess problem solving (reasoning, communication, connections, and thinking skills) and procedures (facts and computation). The SAT-9 open-ended mathematics assessment includes 9 questions or tasks constructed around a single theme. Each question is intended to measure students' ability to communicate and reason mathematically and to apply problem-solving strategies. The content clusters for the open-ended mathematics test include number concepts, patterns and relationships, and concepts of space and shape. The scores are vertically scaled. We recognize the debate over which assessments most accurately measure the breadth and depth of student learning: standardized tests, curriculum-based tests, portfolios, or other alternative assessments. Because the test we use is a standardized test, we consider its results a conservative estimate of student mathematical ability, given that studies are much more likely to find effects on curriculum-aligned tests than on standardized tests (e.g., Carpenter et al., 1989).
Instruction

We measure time spent on mathematics instruction, topic focus, and two cognitive demands: emphasis on memorizing facts and on solving novel problems. Student opportunity to learn, defined as time spent on instruction in the classroom, has for several decades been shown to matter for student achievement (e.g., Carroll, 1963; Gamoran, Porter, Smithson, & White, 1997; Guarino, Hamilton, Lockwood, Rathbun, & Hausken, 2006). It is especially salient for disadvantaged students, who often do not receive high-quality educational experiences outside of school (Alexander, Entwisle, & Olson, 2001). We include two aspects of the content of instruction: topics (e.g., geometry, measurement) and the type of learning required, or cognitive demand (e.g., memorization). We contrast basic with advanced topics, and memorizing facts with solving novel problems. Though reasonable alternatives likely exist to our conception of "advanced" and "basic" topics for a certain grade, we ground our measures in national data that indicate the "average" topics for particular grades, allowing us to extrapolate that advanced topics are those typically taught in the next grade (see Raudenbush et al., 2003). We acknowledge that our measures do not reflect the depth and complexity of instruction (e.g., Good & Brophy, 2000). Our measures are, however, derived from recent work in the teaching and learning of mathematics, which distinguishes procedural (e.g., memorization) and conceptual (e.g., solving novel problems) cognitive demands and/or learning goals (Cohen & Ball, 1990; Desimone, Smith, Hayes, & Frisvold, 2005; NCTM, 1989; Porter, 2002; Spillane & Zeuli, 1999; NCTAF, 1996), and which suggests that content is a stronger predictor of student achievement than is pedagogy (Pellegrino, Baxter, & Glaser, 1999; Porter, Kirst, Osthoff, Smithson, & Schneider, 1993). Our teacher-level analyses use five self-reported instruction variables.
These same teacher variables were also used as predictors in the student-level analyses. One is an indicator of how much time per day a teacher spent on mathematics. In addition, we constructed two composites to indicate a teacher's topic focus: focus on basic math concepts and focus on advanced math concepts. Finally, we created composite measures to indicate a teacher's emphasis on memorizing facts and on solving novel problems. To construct a measure of how much time per day a teacher spent on mathematics instruction, we used the LESCP question that asked teachers to approximate the number of minutes per week they spent on mathematics; the number reported was then divided by 5 to create a measure of the number of minutes a teacher spent on mathematics per day. As we indicated earlier, items were classified into basic or advanced topics based on topics typically taught in Grades 3 through 5, as indicated by a review of several states' standards, by NAEP fourth-grade instructional data, and by following the general categorization of topics in Raudenbush et al.'s (2003) LESCP study of elementary school math instruction. In our study, "teachers' focus on basic math topics" is a five-item composite sum of the number of lessons a teacher taught (or planned to teach) in the following topic areas over the course of the year: (1) measurement (using number lines and rulers); (2) measurement (finding length and perimeter from pictures); (3) numbers and operations (rounding); (4) computation (multidigit multiplication); and (5) problem solving (word problems using addition and subtraction). Teachers indicated their focus on each of these five basic topic areas by reporting the number of lessons they taught (or planned to teach) during the school year. To create a continuous variable indicating the number of lessons a teacher taught on basic math topics, response categories were recoded based on the midpoint of each category.
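As a hedged sketch of this midpoint recoding (the category labels, midpoint values, and function names below follow the description in the text but are otherwise illustrative, not the LESCP instrument's own coding):

```python
# Midpoints used to convert categorical lesson-count responses to a
# continuous scale; an illustrative sketch, not the authors' actual code.
LESSON_MIDPOINTS = {
    "0 lessons": 0, "1-2 lessons": 1.5, "3-5 lessons": 4,
    "6-10 lessons": 8, "11-15 lessons": 13, "More than 15 lessons": 20,
}

def topic_focus(responses):
    """Sum midpoint-recoded lesson counts over the five topic items (range 0-100)."""
    return sum(LESSON_MIDPOINTS[r] for r in responses)
```

A teacher reporting "More than 15 lessons" on all five basic-topic items would score the composite maximum of 100.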
If teachers reported that they did not teach any lessons in a given topic area, their response was coded as 0 lessons per year; if they taught 1–2 lessons, their response was coded as 1.5 lessons per year. Likewise, teaching 3–5 lessons was coded as 4 lessons per year, teaching 6–10 lessons was coded as 8 lessons per year, and teaching 11–15 lessons was coded as 13 lessons per year; teaching more than 15 lessons was coded as 20 lessons per year. Values of the composite range from 0 to 100. Similarly, "teachers' focus on advanced math topics" is a five-item composite sum using the same basic construction as focus on basic math topics. Teachers were asked how many lessons they taught (or planned to teach) in these more advanced topic areas: (1) computation (e.g., operations with fractions), (2) problem solving (e.g., distance problems), (3) prealgebra (e.g., solving equations with one unknown), (4) algebra (e.g., solving two equations with two unknowns), and (5) statistics (e.g., determining central tendency). These five items were coded on a scale of 0 to 20, indicating the number of lessons taught per year. The same calculation and scaling procedures described for "teachers' focus on basic math topics" also applied to "teachers' focus on advanced math topics," with values ranging from 0 to 100. Whether a set of topics was considered basic or advanced depended on the grade. For example, topics typically taught in fifth grade are considered advanced for fourth and third grade; those typically taught in third grade are considered basic for fourth and fifth grade. To construct measures of teacher emphasis on memorizing facts and solving novel problems, we combined time spent on each topic with emphasis on memorization and solving novel problems. That is, we first estimated the amount of time that teachers spent on each of the 10 individual topic areas (described above), to weight the topic areas.
Time spent on each topic was calculated based on the number of lessons each teacher reported teaching for each topic during the year. Based on the midpoints of the response categories, responses were recoded on the same 0–20 scale that we used for the other instruction items, detailed above. Then, for each topic, emphasis on memorizing facts and solving novel problems was based on teacher responses to the question, "When you teach this topic, how much do you emphasize each of the following competencies: (1) memorize facts, and (2) solving novel problems?" Responses were recoded as No Emphasis = 0; Occasionally Emphasized = .33; Emphasized Moderately = .66; and Emphasized a Lot = 1. The created measures include a composite indicator of emphasis on memorizing facts and a composite indicator of emphasis on solving novel problems. The LESCP does include other categories of cognitive demands; however, there was little variation in the use of the cognitive demands in the middle of the distribution—understanding concepts, solving equations, collecting/interpreting data, and solving word problems—so we focused on emphases at the two extremes of the cognitive demand continuum. Finally, we created each of our memorization and solving novel problems measures by multiplying, for each of the 10 topics, the emphasis a teacher placed on that cognitive demand by the number of lessons taught in that topic area, summing these products, and dividing the sum by the total number of lessons the teacher taught across all topic areas. The value of each composite measure for emphasis on memorizing facts and solving novel problems ranges from 0 to 1. This construction allows teachers to emphasize both types of cognitive demands within a topic. Our analyses rely on teacher self-reports of their own instruction, which research shows is a reasonable method for measuring behaviors such as participation in professional development and behaviorally based instructional practices.
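The lesson-weighted emphasis composite described above might be computed as follows (an illustrative sketch under our reading of the text; the function and variable names are ours):

```python
def emphasis_composite(lessons, emphasis):
    """Lesson-weighted average of a teacher's emphasis ratings across topics.

    lessons:  lessons taught per topic on the 0-20 midpoint-recoded scale
    emphasis: emphasis rating per topic (0, .33, .66, or 1)
    Returns a value between 0 and 1.
    """
    total_lessons = sum(lessons)
    if total_lessons == 0:
        return 0.0  # teacher reported no lessons in any topic area
    return sum(l * e for l, e in zip(lessons, emphasis)) / total_lessons
```

Because each topic contributes in proportion to the lessons taught in it, a teacher can emphasize both memorization and solving novel problems within the same topic without the two composites being forced to trade off.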
Responses on confidential surveys like the LESCP are less susceptible to social desirability bias than more public forms of data collection, such as interviews and focus groups (Aquilino, 1994, 1998; Burstein et al., 1995; Dillman & Tarnai, 1991; Fowler, 1995; see Desimone & LeFloch, 2004, for a discussion of the uses and quality of survey data). Further, when survey questions ask teachers to account for their behaviors rather than evaluate or make quality judgments, as the LESCP measures do, the validity and reliability of teacher self-report data can be high (Desimone, Smith, & Frisvold, 2010; Desimone, 2006; Mullens & Gayler, 1999; Mullens et al., 1999). Teacher self-reports are less useful for measuring certain dimensions of teaching practice, such as teacher-student interaction and teacher engagement. But several studies have shown that, on confidential sample surveys, teacher self-reports of the topics and cognitive demands they cover are highly correlated with classroom observations and teacher logs, which are daily, weekly, or monthly teacher self-reports (Mullens & Kasprzyk, 1996, 1999; Smithson & Porter, 1994). Further, studies have shown that one-time surveys that ask teachers questions about the content and strategies they emphasize can be reasonably valid and reliable in measuring teachers' instruction (Mullens, 1995; Mullens & Gayler, 1999; Schmidt et al., 1997; Shavelson, Webb, & Burstein, 1986; Smithson & Porter, 1994) and can effectively describe and distinguish among different types of teaching practices (Mayer, 1999).

Participation in Professional Development

Teacher participation in three types of professional development is measured and used to predict teacher practices in the teacher-level analyses only. We contrast professional development focused on mathematics with that focused on reading, and with professional development focused on something other than content, such as working with parents.
We include the reading/mathematics contrast to test the assumption that learning opportunities in a particular subject are more likely to change teaching practice in that subject. This serves as a more robust test of the link between content focus and teacher change than those used in most prior research in this area. "Professional development with a content focus in mathematics" is a two-item composite sum of the number of hours in the past 12 months that a teacher participated in the following types of professional development: (1) content in mathematics, and (2) instructional strategies in mathematics. Responses were recoded to indicate the number of hours a teacher engaged in this type of professional development: None = 0 hours; Less than a day = 4 hours; 1–2 days = 12 hours; More than 2 days = 20 hours. Responses ranged from 0 to 40 hours, with a mean of 18.14 hours. Similarly, "professional development with a content focus in reading" is a two-item sum of the number of hours in the past 12 months a teacher participated in professional development of the following types: (1) content in reading, and (2) instructional strategies in reading. The coding strategy for mathematics content-focused professional development was used here as well. Values range from 0 to 40, with a mean of 20.27 hours. Our measure of other types of professional development includes the number of hours teachers spent in the following four types of non-content-related professional development: (1) strategies for using assessment results; (2) instructional strategies for teaching low-achieving students; (3) instructional strategies for teaching LEP students; and (4) strategies to increase or strengthen parental involvement. We used the same scale as in the previous two items. Possible values range from 0 to 80; the average response was 22.46 hours.
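As a hedged sketch of the hours recoding described above (category labels and function names are illustrative; the midpoint values follow the text):

```python
# Category-to-hours recoding for professional development items, per the text.
PD_HOURS = {"None": 0, "Less than a day": 4, "1-2 days": 12, "More than 2 days": 20}

def pd_composite(responses):
    """Sum recoded hours across PD items: two items (0-40 hours) for the math
    and reading composites, four items (0-80 hours) for other PD."""
    return sum(PD_HOURS[r] for r in responses)
```

For example, a teacher reporting "1-2 days" on the mathematics-content item and "More than 2 days" on the mathematics-strategies item would be coded as 32 hours of math-focused professional development.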
Covariates

While we are not able to examine all relevant student-, teacher-, and school-level factors, we do control for several such factors most likely to interact with the effectiveness of professional development in changing teaching practice in ways associated with student achievement. These are the grade taught by the teacher, the percentage of low-performing mathematics students in the classroom, the teacher's years of experience, school enrollment, and the percentage of students in the school who receive free or reduced-price lunch.

Student characteristics. In all student-level analyses, we control for student race, gender, and participation in the free and reduced-price lunch program, as well as whether students have individualized education programs (IEPs) or demonstrate limited English proficiency (LEP). Student race is indicated with a set of dummy variables: white, black, and other. "White" is used as the reference category. Gender is a dichotomous variable, with female coded as 0 and male coded as 1. Participation in the free and reduced-price lunch program is also a dichotomous variable, with nonparticipants as the reference group. Similarly, IEP and LEP students are indicated with dichotomous variables, with non-IEP and non-LEP students acting as the reference groups. For all years, there were 322 students with IEPs (5% of the sample) and 193 LEP students (3%). Because these student characteristics are stable over the three years of our study, they are used as time-invariant predictors in our analyses.

Classroom characteristics. In addition to student characteristics, we also control for classroom characteristics that have been shown to be associated with our outcome measures. In the teacher-level models, we include a set of dummy variables to indicate the grade a teacher teaches (third, fourth, or fifth), with third grade as the reference group. Student ability is also assessed at the classroom level.
Teachers were asked to identify the number of students in their classroom they considered to be low performers in mathematics. This number was divided by the total number of students in the class to measure the percentage of low-performing students in each class. Based on Cohen, Raudenbush, and Ball (2003), we hypothesized that the likelihood that teachers would emphasize conceptual cognitive demands might be negatively correlated with the percentage of low-performing students in their classroom. Because these classroom characteristics change over time, they are used as time-varying predictors in our analyses.

Teacher and school characteristics. In all analyses, we control for teacher experience. Teachers were asked, "Counting this year, how many years have you taught in total?" This number is used to indicate teachers' total experience. A quadratic form of years of experience is also included in all analyses to assess whether the impact of experience diminishes over time. We also include measures of two school-level characteristics: school size and school poverty. Both measures come from the interviews with principals that were conducted as part of the LESCP study. School size is indicated by principal reports of total enrollment. School poverty is the percentage of students at each school who participated in the free and reduced-price lunch program, as reported by the principals. These measures are centered on their grand means and are included in both student-level^{5} and teacher-level analyses. See the appendix, Table A.1, for the correlations of the independent variables in the study. Correlations are small; the highest is a .4 correlation between a student's being black and the percentage of students at the school who qualify for free and reduced-price lunch.

ANALYSES

To investigate our two research questions, we conducted two sets of analyses.
To answer our first research question—To what extent do teachers' topic coverage, emphasis on cognitive demands, and time spent on mathematics instruction predict student mathematics achievement growth?—our first set of analyses focuses on predicting student math achievement over time. To accomplish this, we employ a 4-level cross-classified growth model using MLwiN software, with time points nested within students, students cross-classified by teachers over the three years of the study, and teachers (as well as students) nested within schools. Because students generally change teachers as they move from grade to grade, the structure of the data is not strictly hierarchical. That is, we cannot assume that each student belongs to one (and only one) teacher over the course of the three-year study, which poses a problem for traditional hierarchical models. Because the students in our study cross contextual boundaries over time, a cross-classified model is necessary to assess the impact of teacher practices on student achievement (Raudenbush & Bryk, 2002; Raudenbush et al., 2003). Multilevel cross-classified models are not common, but we found two recent analyses to guide our approach: a study of family, school, and locality effects on children's education (Rasbash, Leckie, Pillinger, & Jenkins, 2010) and a study of school and neighborhood effects (Leckie, 2009), both published in the Journal of the Royal Statistical Society. We also used the MLwiN manual, which describes how to implement these complex models (http://www.cmm.bristol.ac.uk/MLwiN/download/MCMC09.pdf). The cross-classified model is constructed by including all time-varying student variables in the time-varying student portion of the model, also referred to as the within-cell portion. In our model, these include the outcome variable (student math achievement) and a growth trajectory indicator.
This indicator is a variable coded 0 for the 1996–1997 school year, 1 for the 1997–1998 school year, and 2 for the 1998–1999 school year. The between-cell portion of the model includes all student time-invariant variables (sometimes referred to as row-level predictors), including student race, gender, free and reduced-price lunch participation, IEP status, and LEP status. Because the between-cell portion of the model accounts for the cross-classification of students and teachers, it also includes all classroom and teacher characteristics (also called column-level predictors). This portion of the model includes the following variables: percentage of low-performing students in the classroom, teachers' years of experience, years of experience squared, minutes per day spent on math, focus on basic math topics, focus on advanced math topics, emphasis on memorizing facts, and emphasis on solving novel problems. To account for school-level variation using MLwiN, we included the school characteristics of school enrollment and school poverty in the between-cell portion of the model. Both variables are grand-mean centered. By entering these school variables into the between-cell portion of the model, MLwiN accounts for the nesting of students and teachers within schools and computes the corresponding school-level variance components. (See the appendix for an illustration of our cross-classified model predicting student math achievement.) Because we want to assess how teaching influences achievement growth over time, we also include a growth portion in our cross-classified model. We do this by including cross-level interactions between the growth trajectory (from the time-varying student portion of our model) and the variables entered in the between-cell portion of the model (including interactions between the growth trajectory and all student time-invariant background variables, all teacher and classroom variables, and all school-level variables).
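One schematic way to write this model (a hedged sketch in our own simplified notation, following the general cross-classified formulation in Raudenbush & Bryk, 2002, rather than the exact LESCP specification) is:

```latex
% Y_{t(ij)k}: math score at time t for student i, taught by teacher j, in school k
Y_{t(ij)k} = \beta_0 + \beta_1\,\mathrm{Time}_t
  + \mathbf{x}_i'\boldsymbol{\gamma}    % student time-invariant covariates
  + \mathbf{w}_j'\boldsymbol{\delta}    % teacher/classroom predictors
  + \mathbf{z}_k'\boldsymbol{\theta}    % school covariates
  + \mathrm{Time}_t \,(\mathbf{x}_i,\mathbf{w}_j,\mathbf{z}_k)'\boldsymbol{\lambda} % growth interactions
  + u_i + v_j + s_k + e_{t(ij)k}
```

where $u_i$, $v_j$, and $s_k$ are the student (row), teacher (column), and school random effects, and the Time-by-predictor interactions carry the growth portion of the model.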
Please see the appendix for more details on our modeling approach. To answer our second research question—To what extent does teacher participation in content-focused professional development predict the aspects of instruction found in our first analysis to be related to increases in student mathematics achievement growth?—our second set of analyses is a series of hierarchical linear models (HLM) testing the relationship between the amount of time teachers spend on specific instructional strategies and teacher participation in professional development. In each of these models, time points are nested within teachers and teachers are nested within schools. Because the nesting structure of teachers within schools and observations within teachers is truly hierarchical (unlike the student data), cross-classified models are not necessary. Five models are run separately: we predict the amount of time teachers spend on math instruction, their focus on basic math topics, their focus on advanced math topics, and the emphasis they place on procedural and conceptual cognitive demands. Level 1 of each model includes time-varying teacher and classroom variables, including grade taught, percentage of low-performing students in math, all three measures of professional development participation, and a growth trajectory variable to measure any potential change in the dependent variables during the three years of the study. Level 2 of each model includes all time-invariant teacher variables; in these models, we include teachers' years of experience and years of experience squared. At Level 3, we include the school characteristics of school size and school poverty, both of which are grand-mean centered. Thus, in our models predicting instruction, we estimate for each year whether hours spent in professional development predict teacher emphasis in that year. The models measure the average relationships between professional development and instruction over the three years of the study.
In initial analyses not reported here, we explicitly modeled growth in our dependent variables over time; however, nothing in the change/growth portion of the models was significant, so hierarchical growth models were not needed. As the means and standard deviations in Table 1 show, there is considerable variation in the amount of professional development in which teachers participated. Teachers who increased their amount of professional development between Year 1 and Year 3 did not necessarily have higher outcomes (in terms of instruction and student achievement), but teachers who took more professional development in Year 1 had greater change over the next three years than teachers who had less professional development in Year 1.

RESULTS

TO WHAT EXTENT DO TEACHERS' TOPIC COVERAGE, COGNITIVE DEMAND EMPHASIS, AND TIME SPENT ON INSTRUCTION PREDICT STUDENT MATHEMATICS ACHIEVEMENT?

Table 2 contains the results of the first analysis, which shows the extent to which topic coverage, cognitive demand emphasis, and time spent on mathematics predict student mathematics achievement growth. Coefficients for the control variables are in the expected directions; that is, the percentage of students qualifying for free and reduced-price lunch and the percentage of low-performing students in class are negatively related to achievement status and growth. Further, as Table 2 shows, being black, being eligible for free and reduced-price lunch, and having an IEP are significantly negatively related to achievement status.
Increased emphasis on more advanced topics and solving novel problems was associated positively with achievement growth, whereas increased emphasis on basic topics and memorization was associated negatively with achievement growth. Specifically, as Table 2 shows, a focus on basic math topics predicted slower than average growth in math achievement (b = −0.042, p < .036), while a focus on advanced math topics predicted faster than average growth (b = .061, p < .045). Since average achievement growth is 20.37 points (see Table 2, intercept for growth trajectory), both of these effects are small, equivalent to about 0.15 of a standard deviation. (To calculate this effect size, we used the following computation: (.042 × 21.19)/5.807, where 21.19 is the standard deviation of time spent on basic mathematics topics (see Table 1) and 5.807 is the standard error of the growth trajectory slope (see Table 2).) Minutes per day spent on mathematics did not significantly predict either initial achievement status or growth. Results were similar for emphasis on memorizing facts. Increased emphasis on memorizing facts was associated with slower than average growth in achievement (b = −6.02, p < .037); this is equivalent to 7.5% slower growth for an increase of one standard deviation in emphasis on memorizing facts (18.99 points, compared with the average of 20.37 points). Emphasis on solving novel problems was associated with extremely modest achievement growth (b = .69, p < .041), which means a 1-unit increase in teachers' emphasis on solving novel problems was associated with only about a 1% increase in the rate of growth of student math achievement. There was an association between focus on advanced math topics and initial achievement (b = 0.172, p < .032), but there was no relationship between cognitive demand emphasis and initial achievement status. The initial relationship between teachers' focus on advanced topics and higher-achieving students could be evidence of sorting.
Or it could reflect a school or curriculum that emphasizes advanced topics in earlier grades. This a priori relationship does not affect our estimations, since we are looking at growth.

TO WHAT EXTENT DOES TEACHER PARTICIPATION IN CONTENT-FOCUSED PROFESSIONAL DEVELOPMENT PREDICT THE ASPECTS OF INSTRUCTION FOUND IN OUR FIRST ANALYSIS TO BE RELATED TO INCREASES IN STUDENT MATHEMATICS ACHIEVEMENT GROWTH?

Table 3 shows that teachers who participated in professional development with a focus on mathematics were significantly more likely to increase their focus on advanced topics in their classrooms (b = 0.147, p < .021). An increase of one standard deviation in math-focused professional development was associated with an increase of about 0.11 of a standard deviation in focus on advanced math topics ((.147 × 13.50) / 17.52). Average teacher change in the use of advanced topics was not significantly different from zero (b = 1.254, p = 0.508); thus the change that occurred in response to content-focused professional development was substantive (11% of a standard deviation versus 0). Eleven percent of a standard deviation translates into moving 1.9 points on the scale, where 1.5 = a frequency of 1–2 lessons and 4 = a frequency of 3–5 lessons.
Participation in professional development was not related to change in how much time teachers spent on mathematics or how much they focused on basic topics. No other relationships were significant: participating in professional development focused on reading, or professional development focused on other topics (using assessments, teaching low-achieving or limited-English-proficient students, or parent involvement), was not significantly related to the amount of time spent on math or to the focus on basic or advanced topics. Similarly, as Table 4 shows, mathematics-focused professional development was associated with an increased emphasis by teachers on solving novel problems (b = .001, p < .008), but not with an emphasis on memorizing facts. Average teacher change was not significant (b = 0.042, p = 0.336). Thus, while participating in content-focused professional development significantly predicted teachers' increased emphasis on solving novel problems, the size of this increase is minimal—an increase of one standard deviation in participating in content-focused professional development was associated with an increase of only about 0.05 of a standard deviation in emphasis on solving novel problems ((0.0010 × 13.50) / 0.28). And in Table 3, participation in professional development focused on reading or other topics besides mathematics was not significantly associated with teachers' cognitive demand emphasis in the classroom^{6}.
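The standardized effect sizes reported in the results can be reproduced from the cited coefficients and standard deviations (a sketch of the arithmetic only; all inputs are the values quoted in the text and tables):

```python
# Reproducing the effect-size arithmetic reported above. Each effect size is
# the coefficient times the SD of the predictor, scaled by the SD (or SE) of
# the outcome quantity cited in the text.
def effect_size(b, sd_predictor, scale):
    """Standardized effect: (b * SD of predictor) / outcome scale."""
    return (b * sd_predictor) / scale

basic_topics = effect_size(0.042, 21.19, 5.807)   # basic-topic focus on growth
pd_advanced  = effect_size(0.147, 13.50, 17.52)   # math PD on advanced-topic focus
pd_novel     = effect_size(0.0010, 13.50, 0.28)   # math PD on novel-problem emphasis
```

These reproduce, respectively, roughly 0.15, 0.11, and 0.05 of a standard deviation.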
DISCUSSION

In interpreting our findings, certain strengths and limitations of the design and analysis should be considered. In answering our research questions, the strengths of the study are that (1) teachers are linked to specific students; (2) the data extend over three years, allowing us to measure a growth trajectory for both teachers and students; and (3) the survey data provide reasonable measures of important aspects of professional development and instruction. Limitations that should be considered in interpreting the findings include the following: (1) not all teachers who were in the first cohort remained across all three years; (2) the sample is representative only of the students of Title I teachers in Grades 3 to 5 in areas where there was an early push for standards-based reform (Westat & Policy Studies Associates, 2001b); (3) while longitudinal data allow more confidence in causal assumptions than do cross-sectional data, an experimental design would provide stronger evidence of cause-and-effect relationships; and (4) we do not have a measure of teachers’ content knowledge, which would give us a more complete model of how professional development affects teachers. Also, some important features of the quality of professional development and instruction are not captured in our analysis. For example, we do not measure the quality with which teachers implement memorization or solving novel problems. Further, our analyses are subject to selection bias: teachers who already teach conceptually might be more likely to seek content-focused professional development and more likely to change their practices. Similarly, teachers who focus on advanced math content may tend to teach higher-achieving students, who may grow faster in math achievement than lower-achieving students; thus initial differences among students could affect the estimates of growth rates.
Omitted variable bias may also exist (as it does in all nonexperimental studies); for example, teaching advanced topics might be correlated with another teaching or organizational factor (e.g., a new policy or curriculum) that is the actual causal factor. Nonetheless, our analysis provides a rare look at change over time in teaching and student achievement. It also offers an opportunity to link professional development to teaching practice, and teaching practice to student achievement, which until recently was uncommon in education research. And although the data are from the late 1990s, they are from schools that implemented standards-based reform early; thus the environment of the schools in our study is likely to resemble today’s accountability environment. This is therefore a useful population for studying how professional development, instruction, and student learning are linked.

EMPHASIS ON SOLVING NOVEL PROBLEMS AND ADVANCED TOPICS FOSTERED ACHIEVEMENT GROWTH, WHILE EMPHASIS ON MEMORIZATION SLOWED GROWTH

Our data analysis supported our main hypotheses in that (1) when teachers in third, fourth, and fifth grade focused more on advanced mathematics topics (defined as operations with fractions, distance problems, solving equations with one unknown, solving two equations with two unknowns, and statistics) and emphasized solving novel problems, student achievement grew more quickly; (2) when teachers focused more on basic topics (defined as measurement, rounding, multidigit multiplication, and problem solving) and emphasized memorizing facts, student achievement grew more slowly; and (3) when teachers participated in professional development that focused on math content or instructional strategies in mathematics (in Year 1), they were more likely to teach in ways associated with student achievement growth—specifically, they were more likely to teach advanced topics and emphasize solving novel problems. Though the analysis supports our main hypotheses, effect sizes varied.
We found quite small links to achievement—an increase of less than 1% of a standard deviation in achievement growth for emphasis on novel problems; this is a small effect in comparison with the results of the similar studies described earlier. However, our findings of 15% slower-than-average growth for students whose teachers focused on basic topics, and 15% faster-than-average growth for students whose teachers focused on advanced topics, are more in line with the range of 10–30% of a standard deviation reported in those studies. We also found that growth was 7.5% of a standard deviation slower than average for students whose teachers emphasized memorizing facts—an effect of moderate size compared with similar studies. The effect of content-focused professional development on fostering an emphasis on advanced topics was in line with other studies, at 11% of a standard deviation; however, the increase in emphasis on novel problems, though significant, was so low—0.5% of a standard deviation—that it is unlikely to be substantively important. It is not surprising that the effect sizes we found were small to moderate. Previous work has suggested that the links we are studying do not manifest themselves in robust relationships, especially in nonexperimental designs that rely on controls, as ours does. This makes the finding on memorization all the more intriguing. We controlled for the percentage of low-performing students in the class, as indicated by the teacher, to account for teachers who were differentially assigned to lower-achieving students. Our findings show that student achievement growth slows, relative to average student growth, when a student encounters a teacher who places more emphasis on memorizing facts in mathematics. Though the longitudinal nature of our study allows us to look at change, its ability to establish temporal antecedence, which is necessary for establishing cause, is still limited.
For example, teachers who are faced with slower learners may change their teaching to focus more on memorizing facts. Another factor that may help to explain the consistently small effects of professional development is that professional development with the intensity and quality required to produce substantial effects is not common (Biancarosa, Bryk, & Dexter, 2010). Our results must also be considered alongside recent randomized experiments on the impact of professional development, which found no effects on student achievement and little effect on teachers (Glazerman et al., 2008; Garet et al., 2008). Critics of these studies, and even the authors themselves, have set forth several explanations of why effects were not found, including (1) the lack of statistical differences between the control and intervention groups in the amount and quality of professional development they received; (2) the uniqueness of the sample of first-year teachers in Glazerman et al. (2008); and (3) the short, one-year timeline of the findings. Garet et al. (2008), though, did find that content focus and duration of professional development had significant effects on teacher knowledge and practice. In comparing studies of the effectiveness of professional development, a potentially crucial component has yet to be explicitly and empirically explored: the role of teacher-directed learning versus participation in an externally assigned intervention. How important is having teachers choose their own professional learning activity so that, for example, it matches local needs (e.g., Borman, 2005), and how does this compare with the ways external policy shapes teacher learning (Desimone, Smith, & Ueno, 2006; Desimone, Smith, & Rowley, 2007; Hochberg & Desimone, 2010; Phillips, Desimone, & Smith, 2011)?
This is one of several questions that must be considered as we integrate findings from various studies to increase our understanding of how and under what circumstances professional development is effective in improving teaching and learning.

CONCLUSION

Here we compared teachers’ use of advanced and basic topics, emphasis on memorization and solving novel problems, and time spent on mathematics; our results are consistent with major mathematics reforms that put a premium on conceptual mathematics, seeking to foster deep understanding that allows students to transfer knowledge to novel situations. We were also able to link teaching practices with professional development, the most popular mechanism for teacher change. Previous work has suggested that content-focused professional development holds the most promise for fostering teaching practice that boosts student achievement; we found evidence to support this. A moderate number of cross-sectional studies have indicated a relationship between content focus and teacher change, but longitudinal studies documenting this change are not common (see Desimone et al., 2002). Our findings may indicate potential effect sizes to build into power analyses in experimental studies of professional development, teacher change, and student achievement. They also offer modest support for content-focused professional development, and for teaching practice that uses more advanced topics and emphasizes solving novel problems. We hope this work will contribute to the refinement and understanding of reforms focused on teacher learning in ways that will improve classroom practice and foster better student learning.

Notes

1. Of course, much can be done to prepare teachers in their preservice training. That is beyond the scope of this paper, though it has been the focus of much other work (see Wilson, Floden, & Ferrini-Mundy, 2002).

2.
The basic components of these theoretical pathways are nearly universal in theoretical notions of the trajectories of teacher learning (e.g., Borko, 2004; Ingvarson, Meiers, & Beavis, 2005), with variations that include an emphasis on context (Borko, 2004), changing the order to reflect teacher change in beliefs as a function of improved student achievement (Guskey, 2002), and acknowledgement of multiple pathways and the individuality of teacher growth (Clarke & Hollingsworth, 2002).

3. Students participated in the LESCP during grades 2–5; however, achievement tests were administered to students only in grades 3–5. Therefore, all analyses are restricted to students in grades 3–5 and their teachers.

4. Additional teachers participated in the LESCP, but our analyses reflect only the teachers used in our student analyses. We confirmed our findings by conducting the same analyses on the full sample of teachers, which yielded similar results.

5. In the student achievement models, all school-level variables are included in the teacher portion of the model.

6. As a sensitivity test, we examined whether teachers’ instruction varied significantly by class- or student-level achievement. We also interacted instruction with achievement in an HLM gains model to test for interaction effects. We found no significant interactions in any of these analyses.

References

Alexander, K. L., Entwisle, D. R., & Olson, L. S. (2001). Schools, achievement, and inequality: A seasonal perspective. Educational Evaluation and Policy Analysis, 23(2), 171–191.

Angrist, J., & Lavy, V. (2001). Does teacher training affect pupil learning? Evidence from matched comparisons in Jerusalem public schools. Journal of Labor Economics, 19(2), 343–369.

Aquilino, W. S. (1994). Interview mode effects in surveys of drug and alcohol use: A field experiment. Public Opinion Quarterly, 58(2), 210–240.

Aquilino, W. S. (1998). Effects of interview mode on measuring depression in younger adults.
Journal of Official Statistics, 14(1), 15–29.

Ball, D. L. (1993). With an eye toward the mathematical horizon: Dilemmas of teaching elementary school mathematics. Elementary School Journal, 93, 373–397.

Balfanz, R., Mac Iver, D. J., & Byrnes, V. (2006). The implementation and impact of evidence-based mathematics reforms in high-poverty middle schools: A multi-site, multi-year study. Journal for Research in Mathematics Education, 37(1), 33–64.

Banilower, E., Heck, D., & Weiss, I. (2005). Can professional development make the vision of the standards a reality? The impact of the National Science Foundation’s Local Systemic Change through Teacher Enhancement Initiative. Journal of Research in Science Teaching, 44(3), 375–395.

Banilower, E., & Shimkus, E. (2004). Professional development observation study. Chapel Hill, NC: Horizon Research.

Barr, R., & Dreeben, R., with Wiratchai, N. (1983). How schools work. Chicago, IL: University of Chicago Press.

Benson, C. C., & Malm, C. G. (2011). Bring the Pythagorean theorem “full circle.” Mathematics Teaching in the Middle School, 16(6), 336–344.

Biancarosa, G., Bryk, A. S., & Dexter, E. R. (2010). Assessing the value-added effects of Literacy Collaborative professional development on student learning. The Elementary School Journal, 111(1), 7–34.

Borko, H. (2004). Professional development and teacher learning: Mapping the terrain. Educational Researcher, 33(8), 3–15.

Borko, H., & Putnam, R. (1995). Expanding a teacher’s knowledge base: A cognitive psychological perspective on professional development. In T. Guskey & M. Huberman (Eds.), Professional development in education: New paradigms and practices (pp. 35–66). New York, NY: Teachers College Press.

Borman, G. D. (2005). National efforts to bring reform to scale in high-poverty schools: Outcomes and implications. Review of Research in Education, 29, 1–28.

Boscardin, C. K., Aguirre-Munoz, Z., Stoker, G., Kim, J., Kim, M., & Lee, J. (2005).
Relationships between opportunity to learn and student performance on English and algebra assessments. Educational Assessment, 10(4), 307–332.

Bressoux, P. (1996). The effect of teachers’ training on pupils’ achievement: The case of elementary schools in France. School Effectiveness and School Improvement, 7(3), 252–279.

Briars, D., & Siegler, R. S. (1984). A featural analysis of preschoolers’ counting knowledge. Developmental Psychology, 20, 607–618.

Burstein, L., McDonnell, L. M., Van Winkle, J., Ormseth, T., Mirocha, J., & Guitton, G. (1995). Validating national curriculum indicators. Santa Monica, CA: RAND.

Cai, J., & Wang, T. (2010). Conceptions of effective mathematics teaching within a cultural context: Perspectives of teachers from China and the United States. Journal of Mathematics Teacher Education, 13, 265–287.

Carlisle, J., Correnti, R., Phelps, G., & Zeng, J. (2009). Exploration of the contribution of teachers’ knowledge about reading to their students’ improvement in reading. Reading and Writing, 22(4), 457–486.

Carnegie Forum on Education and the Economy. (1986). A nation prepared: Teachers for the 21st century. New York, NY: Carnegie Corporation.

Carpenter, T. P., Franke, M. L., Jacobs, V. R., Fennema, E., & Empson, S. B. (1998). A longitudinal study of invention and understanding in children’s multidigit addition and subtraction. Journal for Research in Mathematics Education, 29(1), 3–20.

Carpenter, T. P., Fennema, E., Peterson, P. L., Chiang, C., & Loef, M. (1989). Using knowledge of children’s mathematics thinking in classroom teaching: An experimental study. American Educational Research Journal, 26(4), 499–531.

Carroll, J. B. (1963). A model of school learning. Teachers College Record, 64, 723–733.

Catrambone, R. (1994). Improving examples to improve transfer to novel problems. Memory & Cognition, 22(5), 606–615.

Cauley, K. M. (1988). Construction of logical knowledge: Study of borrowing in subtraction. Journal of Educational Psychology, 80, 202–205.
Clarke, D., & Hollingsworth, H. (2002). Elaborating a model of teacher professional growth. Teaching and Teacher Education, 18, 947–967.

Cobb, P., Wood, T., & Yackel, E. (1991). Analogies from the philosophy and sociology of science for understanding classroom life. Science Education, 75(1), 23–44.

Cobb, P., Wood, T., Yackel, E., Nicholls, J., Wheatley, G., Trigatti, B., & Perlwitz, M. (1991). Assessment of a problem-centered second-grade mathematics project. Journal for Research in Mathematics Education, 22(1), 3–29.

Cohen, D. K., & Ball, D. (1990). Policy and practice: An overview. Educational Evaluation and Policy Analysis, 12(3), 347–353.

Cohen, D. K., & Hill, H. C. (2001). Learning policy: When state education reform works. New Haven, CT: Yale University Press.

Cohen, D. K., Raudenbush, S. W., & Ball, D. L. (2003). Resources, instruction, and research. Educational Evaluation and Policy Analysis, 25(2), 119–143.

Consortium for Policy Research in Education. (1998). A close look at the effects on classroom practice and student performance: A report of the fifth year of the Merck Institute for Science Education (CPRE Evaluation Report). Philadelphia, PA: Author.

Cowan, R., & Renton, M. (1996). Do they know what they are doing? Children’s use of economical addition strategies and knowledge of commutativity. Educational Psychology, 16, 409–422.

Desimone, L. M., & LeFloch, K. (2004). Probing the ‘trickle down’ effect of standards and assessments: Are we asking the right questions? Educational Evaluation and Policy Analysis, 26(1), 1–22.

Desimone, L. M., Porter, A. C., Garet, M., Yoon, K. S., & Birman, B. (2002). Does professional development change teachers’ instruction? Results from a three-year study. Educational Evaluation and Policy Analysis, 24(2), 81–112.

Desimone, L. M., Smith, T., Baker, D., & Ueno, K. (2005). The distribution of teaching quality in mathematics: Assessing barriers to the reform of United States mathematics instruction from an international perspective.
American Educational Research Journal, 42(3), 501–535.

Desimone, L., Smith, T., & Ueno, K. (2006). Are teachers who need sustained, content-focused professional development getting it? An administrator’s dilemma. Educational Administration Quarterly, 42(2), 179–215.

Desimone, L. M., Smith, T. S., & Rowley, K. (2007). Does policy influence mathematics and science teachers’ participation in professional development? Teachers College Record, 109(5), 1086–1122.

Desimone, L. M., Smith, T. S., Hayes, S., & Frisvold, D. (2005). Beyond accountability and average math scores: Relating multiple state education policy attributes to changes in student achievement in procedural knowledge, conceptual understanding, and problem solving in mathematics. Educational Measurement: Issues and Practice, 24(4), 5–18.

Desimone, L., Smith, T., & Frisvold, D. (2007). Is NCLB increasing teacher quality for students in poverty? In A. Gamoran (Ed.), Will standards-based reform in education help close the poverty gap? (pp. 89–119). Washington, DC: Brookings Institution.

Desimone, L., Smith, T., & Frisvold, D. (2010). Survey measures of classroom instruction: Comparing student and teacher reports. Educational Policy, 24(2), 267–329.

Desimone, L. (2006). Consider the source: Response differences among teachers, principals, and districts on survey questions about their education policy environment. Educational Policy, 20(4), 640–676.

Desimone, L. (2009). How can we best measure teachers’ professional development and its effects on teachers and students? Educational Researcher, 38(3), 181–199.

Desimone, L. M., & Long, D. (2010). Does conceptual instruction and time spent on mathematics decrease the student achievement gap in early elementary school? Findings from the Early Childhood Longitudinal Study (ECLS). Teachers College Record, 112(12).

Dildy, P. (1991). Improving student achievement by appropriate teacher in-service training: Utilizing Program for Effective Teaching (PET). Education, 103(2), 132–138.
Dillman, D. A., & Tarnai, J. (1991). Mode effects of cognitively designed recall questions: A comparison of answers to telephone and mail surveys. In P. N. Beimer, R. M. Groves, L. E. Lyberg, N. A. Mathiowetz, & S. Sudman (Eds.), Measurement errors in surveys (pp. 367–393). New York: John Wiley.

Elmore, R. F., & Burney, D. (1997). Investing in teacher learning: Staff development and instructional improvement in Community School District #2, New York City. New York: National Commission on Teaching and America’s Future. (ERIC Document Reproduction Service No. ED 416203)

Ernest, P. (1989). The impact of beliefs on the teaching of mathematics. In P. Ernest (Ed.), Mathematics teaching: The state of the art (pp. 249–254). New York: The Falmer Press.

Fennema, E., Carpenter, T. P., Franke, M. L., Levi, L., Jacobs, V. R., & Empson, S. B. (1996). A longitudinal study of learning to use children’s thinking in mathematics instruction. Journal for Research in Mathematics Education, 27(4), 403–434.

Firestone, W., Mangin, M., Martinez, M., & Polovsky, T. (2005). Leading coherent professional development: A comparison of three districts. Educational Administration Quarterly, 41(3), 413–448.

Fishman, B. J., Marx, R. W., Best, S., & Tal, R. T. (2003). Linking teacher and student learning to improve professional development in systemic reform. Teaching and Teacher Education, 19, 643–658.

Fowler, F. J., Jr. (1995). Improving survey questions: Design and evaluation. Applied social research methods series (Vol. 38). Thousand Oaks, CA: SAGE.

Franke, M. L., Carpenter, T. P., & Levi, L. (2001). Capturing teachers’ generative change: A follow-up study of professional development in mathematics. American Educational Research Journal, 38, 653–689.

Fuson, K. C. (1990). Conceptual structures for multiunit numbers: Implications for learning and teaching multidigit addition, subtraction, and place value. Cognition and Instruction, 7, 343–403.

Gamoran, A. (1986).
Instructional and institutional effects of ability grouping. Sociology of Education, 59(4), 185–198.

Gamoran, A., Porter, A. C., Smithson, J., & White, P. A. (1997). Upgrading high school mathematics instruction: Improving learning opportunities for low-achieving, low-income youth. Educational Evaluation and Policy Analysis, 19(4), 325–338.

Garet, M. S., Cronen, S., Eaton, M., Kurki, A., Ludwig, M., Jones, W., & Silverberg, M. (2008). The impact of two professional development interventions on early reading instruction and achievement. Washington, DC: U.S. Department of Education, Institute of Education Sciences.

Garet, M. S., Porter, A. C., Desimone, L. M., Birman, B., & Yoon, K. S. (2001). What makes professional development effective? Analysis of a national sample of teachers. American Educational Research Journal, 38(3), 915–945.

Geary, D. C. (2001). A Darwinian perspective on mathematics and instruction. In T. Loveless (Ed.), The great curriculum debate: How should we teach reading and math? (pp. 85–107). Washington, DC: Brookings Institution.

Geijsel, F. P., Sleegers, P. J. C., Stoel, R. D., & Kruger, M. L. (2009). The effect of teacher psychological and school organizational and leadership factors on teachers’ professional learning in Dutch schools. Elementary School Journal, 109(4), 406–427.

Gersten, R., Dimino, J., Jayanthi, M., Kim, J. S., & Santoro, L. E. (2010). Teacher study group: Impact of the professional development model on reading instruction and student outcomes in first grade classrooms. American Educational Research Journal, 47(3), 694–739.

Glazerman, S., Dolfin, S., Bleeker, M., Johnson, A., Isenberg, E., Lugo-Gil, J., … Ali, M. (2008). Impacts of comprehensive teacher induction: Results from the first year of a randomized controlled study. Washington, DC: U.S. Department of Education, Institute of Education Sciences.

Goldschmidt, P., & Phelps, G. (2010).
Does teacher professional development affect content and pedagogical knowledge: How much and for how long? Economics of Education Review, 29(3), 432–439.

Good, T., & Brophy, J. (2000). Looking in classrooms (8th ed.). New York: Addison-Wesley.

Guarino, C. M., Hamilton, L. S., Lockwood, J. R., Rathbun, A. H., & Hausken, E. G. (2006). Teacher qualifications, instructional practices, and reading and mathematics gains of kindergartners. Washington, DC: National Center for Education Statistics.

Hamilton, L. S., McCaffrey, D. F., Stecher, B. M., Klein, S. P., Robyn, A., & Bugliari, D. (2003). Studying large-scale reforms of instructional practice: An example from mathematics and science. Educational Evaluation and Policy Analysis, 25(1), 1–29.

Heck, D. J., Banilower, E. R., Weiss, I. R., & Rosenberg, S. L. (2008). Studying the effects of professional development: The case of the NSF’s Local Systemic Change through Teacher Enhancement Initiative. Journal for Research in Mathematics Education, 39(2), 113–152.

Hiebert, J., & Wearne, D. (1996). Instruction, understanding, and skill in multidigit addition and subtraction. Cognition and Instruction, 14, 251–283.

Hiebert, J., & Wearne, D. (1986). Procedures over concepts: The acquisition of decimal number knowledge. In J. Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 199–224). Hillsdale, NJ: Erlbaum.

Hiebert, J. (1999). Relationships between research and the NCTM Standards. Journal for Research in Mathematics Education, 30(1), 3–19.

Hiebert, J., Carpenter, T. P., Fennema, E., Fuson, K., Human, P., Murray, H., … Wearne, D. (1996). Problem solving as a basis for reform in curriculum and instruction: The case of mathematics. Educational Researcher, 25(4), 12–21.

Hill, H. C., Ball, D. L., & Schilling, S. G. (2008). Unpacking pedagogical content knowledge: Conceptualizing and measuring teachers’ topic-specific knowledge of students. Journal for Research in Mathematics Education, 39(4), 372–400.

Hirsch, E.
D. (2001). The roots of the education wars. In T. Loveless (Ed.), The great curriculum debate: How should we teach reading and math? (pp. 13–24). Washington, DC: Brookings Institution.

Hochberg, E., & Desimone, L. (2010). Professional development in the accountability context: Building capacity to achieve standards. Educational Psychologist, 45(2), 89–106.

Jacob, B., & Lefgren, L. (2004). The impact of teacher training on student achievement: Quasi-experimental evidence from school reform efforts in Chicago. Journal of Human Resources, 39(1), 50–79.

Kanter, D., & Konstantopoulos, S. (2010). The impact of a project-based science curriculum on minority student achievement, attitudes, and careers: The effects of teacher content and pedagogical content knowledge and inquiry-based practices. Science Education, 94(5), 855–887.

Kazemi, E., & Stipek, D. (2001). Promoting conceptual thinking in four upper-elementary mathematics classrooms. The Elementary School Journal, 102(1), 59–80.

Kennedy, M. M. (1998). Form and substance in inservice teacher education (Research Monograph No. 13). Arlington, VA: National Science Foundation.

Kilpatrick, J., Swafford, J., & Findell, B. (Eds.). (2001). Adding it up: Helping children learn mathematics. Washington, DC: National Academy Press.

Kouba, V. L., Carpenter, T. P., & Swafford, J. O. (1989). Number and operations. In M. M. Lindquist (Ed.), Results from the Fourth Mathematics Assessment of the National Assessment of Educational Progress (pp. 64–93). Reston, VA: National Council of Teachers of Mathematics.

Knapp, M. (1995). Teaching for meaning in high-poverty classrooms. New York, NY: Teachers College Press.

Lampert, M. (1990). When the problem is not the question and the solution is not the answer: Mathematical knowing and teaching. American Educational Research Journal, 27, 29–63.

Leckie, G.
(2009). The complexity of school and neighbourhood effects and movements of pupils on school differences in models of educational achievement. Journal of the Royal Statistical Society: Series A, 172, 537–554.

LeFevre, J., & Dixon, P. (1986). Do written instructions need examples? Cognition and Instruction, 3(1), 1–30.

Li, Y. (2008). What do students need to learn about division of fractions? Mathematics Teaching in the Middle School, 13(9), 546–552.

Little, J. W. (1993). Teachers’ professional development in a climate of educational reform. Educational Evaluation and Policy Analysis, 15(2), 129–151.

Loucks-Horsley, S., Hewson, P. W., Love, N., & Stiles, K. (1998). Designing professional development for teachers of science and mathematics. Thousand Oaks, CA: Corwin Press.

Loveless, T. (Ed.). (2001). The great curriculum debate: How should we teach reading and math? Washington, DC: Brookings Institution Press.

Ma, L. (1999). Knowing and teaching elementary mathematics: Teachers’ knowledge of fundamental mathematics in China and the United States. Mahwah, NJ: Lawrence Erlbaum.

Mayer, D. P. (1999). Measuring instructional practice: Can policymakers trust survey data? Educational Evaluation and Policy Analysis, 21(1), 29–45.

Mullens, J. E. (1995). Classroom instructional processes: A review of existing measurement approaches and their applicability for the Teacher Follow-up Survey (Working Paper No. NCES-WP-95-15). Washington, DC: U.S. Department of Education, National Center for Education Statistics.

Mullens, J. E., & Gayler, K. (1999). Measuring classroom instructional processes: Using survey and case study field-test results to improve item construction (Working Paper No. 1999–08). Washington, DC: U.S. Department of Education, National Center for Education Statistics.

Mullens, J. E., Gayler, K., Goldstein, D., Hildreth, J., Rubenstein, M., Spiggle, T., … Welsh, M. (1999).
Measuring classroom instructional processes: Using survey and case study field-test results to improve item construction (Working Paper series). Washington, DC: U.S. Department of Education.

Mullens, J. E., & Kasprzyk, D. (1996). Using qualitative methods to validate quantitative survey instruments. Proceedings of the Section on Survey Research Methods. Paper presented at the annual meeting of the American Statistical Association, Alexandria, VA.

Mullens, J., & Kasprzyk, D. (1999). Validating item responses on self-report teacher surveys. Washington, DC: U.S. Department of Education.

National Commission on Teaching and America’s Future. (1996). What matters most: Teaching for America’s future. New York: Author.

National Commission on Teaching and America’s Future. (1997). Doing what matters most: Investing in quality teaching. New York: Author.

National Council of Teachers of Mathematics. (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: Author.

National Council of Teachers of Mathematics. (2000). Principles and standards for school mathematics. Reston, VA: Author.

National Council of Teachers of Mathematics. (2006). Curriculum focal points for prekindergarten through Grade 8 mathematics. Reston, VA: Author.

O’Sullivan, O., & McGonigle, S. (2010). Transforming readers: Teachers and children in the Centre for Literacy in Primary Education Power of Reading project. Literacy, 44(2), 51–59.

Paas, F. G. W., & van Merrienboer, J. J. G. (1994). Variability of worked examples and transfer of geometrical problem-solving skills: A cognitive-load approach. Journal of Educational Psychology, 86, 122–133.

Parise, L., & Spillane, J. (2010). Teacher learning and instructional change: How formal and on-the-job learning opportunities predict change in elementary school teachers’ practice. Elementary School Journal, 110(3), 323–346.

Pellegrino, J., Baxter, G., & Glaser, R. (1999).
Addressing the “Two Disciplines” problem: Linking theories of cognition and learning with assessment and instructional practice. Review of Research in Education, 24, 307–353.

Penuel, W. R., Fishman, B., Yamaguchi, R., & Gallagher, L. P. (2007). What makes professional development effective? Strategies that foster curriculum implementation. American Educational Research Journal, 44(4), 921–958.

Perry, M. (1991). Learning and transfer: Instructional conditions and conceptual change. Cognitive Development, 6, 449–468.

Phelps, G., & Schilling, S. (2004). Developing measures of content knowledge for teaching reading. Elementary School Journal, 105(1), 31–48.

Porter, A. (1995). The uses and misuses of opportunity-to-learn standards. Educational Researcher, 24(1), 21–27.

Porter, A. C. (2002). Measuring the content of instruction: Uses in research and practice. Educational Researcher, 31(7), 3–14.

Porter, A. C., Kirst, M. W., Osthoff, E. J., Smithson, J. L., & Schneider, S. A. (1993). Reform up close: An analysis of high school mathematics and science classrooms. New Brunswick, NJ: Consortium for Policy Research in Education.

Rasbash, J., Leckie, G., Pillinger, R., & Jenkins, J. (2010). Children’s educational progress: Partitioning family, school and area effects. Journal of the Royal Statistical Society: Series A, 173, 657–682.

Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks, CA: Sage.

Raudenbush, S. W., Hong, G., & Rowan, B. (2003). Studying the causal effects of instruction with application to primary-school mathematics. Paper prepared for Research Seminar II, Instructional and Performance Consequences of High-Poverty Schooling. Washington, DC: The Charles Sumner School Museum and Archives.

Rittle-Johnson, B., Siegler, R. S., & Alibali, M. W. (2001). Developing conceptual understanding and procedural skill in mathematics: An iterative process. Journal of Educational Psychology, 93, 346–362.
Rittle-Johnson, B., & Star, J. R. (2007). Does comparing solution methods facilitate conceptual and procedural knowledge? An experimental study on learning to solve equations. Journal of Educational Psychology, 99, 561-574. Romberg, T. A. (2000). Changing the teaching and learning of mathematics. Australian Mathematics Teacher, 56(4), 6-9. Rosenholtz, S. J. (1989). Workplace conditions that affect teacher quality and commitment: Implications for teacher induction programs. Elementary School Journal, 89(4), 421-439. Rowan, B., Harrison, D. M., & Hayes, A. (2004). Using instructional logs to study mathematics curriculum and teaching in the early grades. The Elementary School Journal, 105(1), 112-127. Phillips, K. J. R., Desimone, L., & Smith, T. (2011). Teacher participation in content-focused professional development and the role of state policy. Teachers College Record, 113(11). http://www.tcrecord.org ID Number: 16145. Saxe, G. B., Gearhart, M., & Nasir, N. S. (2001). Enhancing students’ understanding of mathematics: A study of three contrasting approaches to professional support. Journal of Mathematics Teacher Education, 4, 55-79. Schmidt, W. H., McKnight, C. C., & Raizen, S. A. (1997). A splintered vision: An investigation of U.S. science and mathematics education. Boston, MA: Kluwer Academic Publishers. Shavelson, R. J., Webb, N. M., & Burstein, L. (1986). Measurement of teaching. In M. C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 50-91). New York: Macmillan. Schoenfeld, A. (1985). Mathematical problem solving. New York, NY: Academic Press. Schoenfeld, A. H. (1988). When good teaching leads to bad results: The disasters of ‘well-taught’ mathematics courses. Educational Psychologist, 23(2), 145-166. Siegler, R. S. (1996). Emerging minds: The process of change in children’s thinking. New York: Oxford University Press. Silver, E. A. (Ed.). (1985). Teaching and learning mathematical problem solving: Multiple research perspectives. 
Mahwah, NJ: Lawrence Erlbaum Associates. Silver, E. A., Ghousseini, H., Gosen, D., Charalambous, C., & Strawhun, B. (2005). Moving from rhetoric to praxis: Issues faced by teachers in having students consider multiple solutions for problems in the mathematics classroom. Journal of Mathematical Behavior, 24, 287-301. Singley, M. K., & Anderson, J. R. (1989). The transfer of cognitive skill. Cambridge, MA: Harvard University Press. Slavin, R. E., Madden, N. A., Karweit, N. L., Livermon, B. J., & Dolan, L. (1990). Success for all: First-year outcomes of a comprehensive plan for reforming urban education. American Educational Research Journal, 27(2), 255-278. Smithson, J. L., & Porter, A. C. (1994). Measuring classroom practice: Lessons learned from efforts to describe the enacted curriculum—the Reform Up Close Study. New Brunswick, NJ: Consortium for Policy Research in Education, Rutgers University. Spillane, J. P., & Zeuli, J. (1999). Reform and teaching: Exploring patterns of practice in the context of national and state mathematics reforms. Educational Evaluation and Policy Analysis, 21(1), 1-27. Stanford Achievement Test, Ninth Edition (SAT-9). Star, J., & Seifert, C. (2006). The development of flexibility in equation solving. Contemporary Educational Psychology, 31(3), 280-300. Stein, M. K., & Lane, S. (1996). Instructional tasks and the development of student capacity to think and reason: An analysis of the relationship between teaching and learning in a reform mathematics project. Educational Research and Evaluation, 2(1), 50-80. Stigler, J. W., & Hiebert, J. (1999). The teaching gap. New York: Free Press. Supovitz, J., & Turner, H. (2000). The effects of professional development on science teaching practices and classroom culture. Journal of Research in Science Teaching, 37(9), 963-980. United States Department of Education. (2008). Foundations for success: The final report of the National Mathematics Advisory Panel. Washington, DC: Author. von Secker, C. (2002). 
Effects of inquiry-based teacher practices on science excellence and equity. Journal of Educational Research, 95, 151-160. Wayne, A., Yoon, K. S., Zhu, P., Cronen, S., & Garet, M. S. (2008). Experimenting with teacher professional development: Motives and methods. Educational Researcher, 37(8), 469-479. Wenglinsky, H. (2004). Closing the racial achievement gap: The role of reforming instructional practices. Education Policy Analysis Archives, 12(64). Retrieved from http://epaa.asu.edu/epaa/v12n64/ Westat and Policy Studies Associates. (2001a). The longitudinal evaluation of school change and performance in Title I schools, Volume 1. Washington, DC: U.S. Department of Education, Office of the Deputy Secretary, Planning and Evaluation Service. Westat and Policy Studies Associates. (2001b). The longitudinal evaluation of school change and performance in Title I schools: Volume 2, Technical report. Washington, DC: U.S. Department of Education, Office of the Deputy Secretary, Planning and Evaluation Service. Wiley, D., & Yoon, B. (1995). Teacher reports on opportunity to learn: Analyses of the 1993 California Learning Assessment System (CLAS). Educational Evaluation and Policy Analysis, 17(3), 355-370. Wilson, S., & Ball, D. (1991). Changing visions and changing practices: Patchworks in learning to teach mathematics for understanding. Research report 91-2. East Lansing, MI: The National Center for Research on Teacher Education. Wong, K. K., Meyer, S. J., & Shen, F. X. (2003). Educational resources and achievement gaps in high poverty schools: Findings from the Longitudinal Evaluation of School Change and Performance (LESCP) in Title I schools. Unpublished manuscript.

Appendix

Figure A.1. Cross-Classified Model Predicting Student Math Achievement

Time-Varying Student Model:

(1) Y_{ijkl} = π_{0jkl} + π_{1jkl}GROWTH TRAJECTORY_{ijkl} + e_{ijkl}

Y_{ijkl} represents mathematics achievement at time point i, for student j, who was assigned to teacher k, teaching in school l. 
π_{0jkl} is the expected math achievement for student j. π_{1jkl}GROWTH TRAJECTORY_{ijkl} is the annual rate of growth in mathematics achievement for student j (in which the trajectory takes on the following values: 0 for the 1996-97 school year, 1 for the 1997-98 school year, and 2 for the 1998-99 school year). e_{ijkl} is the random within-subject residual, or the deviation of ijkl’s mathematics achievement from the cell mean, which is assumed to be normally distributed.

Between-Cell Model:

(2) π_{0jkl} = θ_{0} + γ_{01}INVARIANT STUDENT BACKGROUND_{01j} + b_{00j} + β_{01}TEACHER CHARACTERISTICS_{01k} + c_{00k} + λ_{01}SCHOOL CHARACTERISTICS_{01l} + d_{00l}

π_{0jkl} is the expected math achievement for student j. θ_{0} is average math achievement. γ_{01}INVARIANT STUDENT BACKGROUND_{01j} refers to the coefficients associated with each of the student background measures we use in our analysis that do not, in our sample, vary over time: students’ race, gender, free and reduced-price lunch status, IEP status, and LEP status. These variables are considered row-level predictors within the HCM framework (Raudenbush & Bryk, 2002). b_{00j} is the random effect associated with student j during the first year of the study. β_{01}TEACHER CHARACTERISTICS_{01k} represents the coefficients associated with two teacher-specific variables (years of experience and years of experience squared), one classroom variable (percent of low-performing students in math), and five instruction variables (minutes per day spent on math, focus on basic math topics, focus on advanced math topics, emphasis on memorizing facts, and emphasis on solving novel problems). These characteristics are measured at the teacher level and are considered column-level predictors within the HCM framework (Raudenbush & Bryk, 2002). c_{00k} is the random effect, or an expected deflection to the growth curve, associated with encountering teacher k. 
λ_{01}SCHOOL CHARACTERISTICS_{01l} refers to the coefficients associated with each of the school-level measures we use in our analysis: school enrollment and school poverty. The school characteristics measures are grand-mean centered. d_{00l} is the random effect, or an expected deflection to the growth curve, associated with encountering school l.

Growth Portion of Model:

(3) π_{1jkl} = θ_{1} + b_{10j} + γ_{10}INVARIANT STUDENT BACKGROUND_{10j} + β_{10}TEACHER CHARACTERISTICS_{10k} + λ_{10}SCHOOL CHARACTERISTICS_{10l}

π_{1jkl} is the annual rate of growth in mathematics achievement for student j. θ_{1} is the average learning rate. b_{10j} is the random effect associated with student j’s rate of growth in mathematics achievement. γ_{10}INVARIANT STUDENT BACKGROUND_{10j} refers to the coefficients associated with each of the student background measures we use in our analysis. β_{10}TEACHER CHARACTERISTICS_{10k} includes the coefficients associated with the teacher, classroom, and instruction variables in our model. λ_{10}SCHOOL CHARACTERISTICS_{10l} refers to the coefficients associated with the school variables in our model. The school characteristics measures are grand-mean centered.

Figure A.2. HLM Growth Models Predicting Teaching Practices

Level 1 Model (time-varying teacher model):

(1) Y_{ijk} = π_{0jk} + π_{1jk}GROWTH TRAJECTORY_{ijk} + π_{2jk}GRADE LEVEL_{ijk} + π_{3jk}CLASS COMPOSITION_{ijk} + π_{4jk}PROFESSIONAL DEVELOPMENT_{ijk} + e_{ijk}

Y_{ijk} represents one of five dependent variables related to time spent on certain teaching practices at time point i, for teacher j, teaching in school k. π_{0jk} is the mean time spent on specific teaching practices of teacher j in school k. 
(The five dependent variables used in our analyses are time spent on mathematics instruction, focus on basic math content, focus on advanced math content, use of instructional strategies related to memorizing facts, and use of instructional strategies related to solving novel problems.) π_{1jk}GROWTH TRAJECTORY_{ijk} is the annual rate of growth in the use of specific teaching practices for teacher j in school k (in which the trajectory takes on the following values: 0 for the 1996-97 school year, 1 for the 1997-98 school year, and 2 for the 1998-99 school year). π_{2jk}GRADE LEVEL_{ijk} represents a series of three dummy variables used to control for the grade teacher j teaches within school k: second grade, fourth grade, and fifth grade (third grade is the reference group). π_{3jk}CLASS COMPOSITION_{ijk} represents the percentage of low-performing students in mathematics at time i in teacher j’s classroom, within school k. π_{4jk}PROFESSIONAL DEVELOPMENT_{ijk} represents the coefficients associated with participation in three types of professional development at time i for teacher j in school k: professional development with a content focus in math, professional development with a content focus in reading, and professional development with other foci. e_{ijk} is the random within-subject residual, or the deviation of ijk’s use of teaching practices from the teacher’s mean, which is assumed to be normally distributed.

Level 2 Model (time-invariant teacher model):

(2) π_{0jk} = β_{00k} + β_{01k}(YEARS OF EXPERIENCE)_{jk} + r_{0jk}

π_{0jk} is the expected time spent on certain teaching practices for teacher j. β_{00k} is average time spent on certain teaching practices in school k. β_{01k}(YEARS OF EXPERIENCE)_{jk} represents the coefficients associated with teachers’ years of experience and a quadratic term for teachers’ years of experience. 
r_{0jk} is the random teacher effect, or the deviation of teacher jk’s mean from the school mean, which is assumed to be normally distributed.

Level 3 Model (school-level model):

(3) β_{00k} = γ_{000} + γ_{001}(SCHOOL CHARACTERISTICS)_{k} + u_{00k}

β_{00k} is average time spent on certain teaching practices in school k. γ_{000} is the grand mean of time spent on certain teaching practices. γ_{001}(SCHOOL CHARACTERISTICS)_{k} represents the coefficients associated with the two school-level variables used in our model for school k: school enrollment and school poverty. The school characteristics measures are grand-mean centered. u_{00k} is a random “school effect,” or the deviation of school k’s mean from the grand mean, which is assumed to be normally distributed.
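The full specifications above require software that handles cross-classified and three-level structures (the study uses MLwiN and HLM). The core building block they share, however, is a linear growth trajectory with a random intercept for each unit, and that piece can be sketched in a few lines. The following Python example, using statsmodels, fits a simplified two-level analog of equation (1) on synthetic data; the data values, sample sizes, and variable names are hypothetical illustrations, not LESCP quantities.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data: 200 "students" measured at three waves, coded 0, 1, 2
# to match the LESCP trajectory coding. All values are hypothetical.
rng = np.random.default_rng(0)
n_students, n_waves = 200, 3
student = np.repeat(np.arange(n_students), n_waves)
wave = np.tile(np.arange(n_waves), n_students)
intercepts = rng.normal(600.0, 30.0, n_students)  # student-specific starting levels
score = intercepts[student] + 12.0 * wave + rng.normal(0.0, 10.0, n_students * n_waves)
df = pd.DataFrame({"score": score, "student": student, "wave": wave})

# Two-level random-intercept growth model: score ~ wave, with the intercept
# varying by student. (The study's model additionally crosses students with
# teachers over years and nests both within schools.)
result = smf.mixedlm("score ~ wave", df, groups=df["student"]).fit()
print(result.params["Intercept"], result.params["wave"])
```

In this simplified analog, the fixed coefficient on wave plays the role of θ_{1}, the average annual learning rate, and the estimated group variance corresponds to the student random effect.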
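Both appendix models grand-mean center the school characteristics. As a minimal sketch of what that transformation does, the snippet below centers two hypothetical school-level predictors (the enrollment and poverty figures are invented for illustration): each school's value becomes its deviation from the overall mean, so the model intercept is interpreted as the expected outcome for a school with average enrollment and average poverty.

```python
import numpy as np

# Hypothetical school-level predictors for five schools
enrollment = np.array([320.0, 410.0, 508.0, 275.0, 390.0])
poverty = np.array([0.82, 0.75, 0.91, 0.68, 0.88])  # proportion of students in poverty

# Grand-mean centering: subtract the overall mean from every school's value
enrollment_c = enrollment - enrollment.mean()
poverty_c = poverty - poverty.mean()

print(enrollment_c)  # deviations from the mean; they sum to (approximately) zero
```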


