Evaluation is a critical element of effective educational programs. Evaluation can help SUCCEED investigators improve their project interventions and operations, and will provide information on the extent to which they are achieving their project objectives.
The purpose of this paper is to help SUCCEED project investigators develop and carry out sound evaluations. It is intended to help them answer three basic questions:
1. Why should you evaluate your project?
2. What should you evaluate?
3. How should you evaluate?
In an effort to make the primer as practical and useful as possible, we have divided the material into a series of eight sequential evaluation steps. As shown in Figure 1, these steps represent a series of decisions and actions to be taken by SUCCEED investigators, each building on the one before. Each section of the primer addresses one step, and includes three parts. First, the basic evaluation concepts and techniques are briefly explained. Second, an example is provided, illustrating the application of evaluation to an engineering education project. Finally, each section closes with a series of recommended action steps that can serve as guidelines for project investigators.

Figure 1: Steps in Project Evaluation
Evaluation concepts and techniques will be illustrated through the use of a running example, presented in shaded boxes throughout the text. In developing this example, we have drawn heavily on the experience of a project team at the Georgia Institute of Technology who are working on "Precision Teaching of an Introductory Physics E & M Course for Engineers". Where possible, we have cited experiences and results from this project. In some instances, however, we have added hypothetical data or altered the project evaluation design to illustrate a specific point, and this material should therefore not be confused with a report of the project's approach or achievements. A brief overview of the Precision Teaching (PT) project is presented below.
The Precision Teaching Project: An Overview
The goal of this project is to improve the instruction of introductory physics in the undergraduate engineering and science curricula at the Georgia Institute of Technology (Georgia Tech), and to do so based on current psychological theory. About 30% of all entering freshmen nationwide fail to graduate as engineers, and most of this attrition occurs in the first two years. An important contributor to this failure rate is the course in introductory Physics, and particularly the sections dealing with electricity and magnetism. Almost 30% of the students enrolled in the introductory Physics sequence at Georgia Tech fail to make an acceptable "C" grade. Learning physics is a complex activity that involves at least three different kinds of knowledge: declarative, procedural and conceptual. A technique known as "precision teaching" is used to help students proceed through the declarative and procedural stages of learning to the conceptual stage. The project seeks to demonstrate that concentrating on the enhancement of basic skills in electricity and magnetism (E&M) can significantly improve performance. The principal intervention developed through the project is a set of student exercises, in both pencil-and-paper and microcomputer formats, to be used as a complement to traditional course instruction.

Step 1: Defining the Purpose of the Evaluation
One common misconception about evaluation is that it is something done after a project has been implemented, by an outside group who then judges whether or not the project was effective. While this is often true in practice, it is not the only or necessarily the right way to conduct evaluation. A good evaluation plan should be developed before a project is implemented, should be designed in conjunction with the project investigators, and should ensure that evaluation serves two broad purposes. First, evaluation activities should provide information to guide the redesign and improvement of the intervention. Second, evaluation activities should provide information that can be used by both the project investigators and other interested parties to decide whether or not they should implement the intervention on a wider scale.
These two purposes correspond with two broad types of evaluation: formative and summative. The goal of formative evaluation is to improve the intervention or project. The goal of summative evaluation is to judge the effectiveness, efficiency, or cost of an intervention. For most SUCCEED projects, both types of evaluation should be conducted.
The purpose of formative evaluation is to provide information to the project team so that their intervention can be modified and improved. It focuses on whether the intervention is being carried out as planned. Formative evaluation activities can include materials and software development and beta testing, focus groups to assess students' attitudes and responses to aspects of intervention design and materials, and experimental studies to determine the effect of specific design characteristics on students' mastery and retention of concepts and skills. While some of these activities also yield data related to intervention effectiveness, their primary goal is to provide information for intervention improvement.
The purpose of summative evaluation is to produce information that can be used to make decisions about the overall success of the intervention. There are three specific and sequential types of summative evaluation questions that should be addressed for any intervention:
Intervention Efficacy
Efficacy evaluation asks the question: "Under research (ideal) conditions, can the intervention lead to the desired outcomes?" Efficacy questions assess whether an intervention is associated with improvements in students' performance when implemented in small groups, by teachers who receive special instruction, with motivational support provided for student participation.
Intervention Effectiveness
Effectiveness evaluation asks the question: "When implemented on a wider scale, under conditions similar to those that occur in regular university teaching, does the intervention continue to lead to desired outcomes?" For SUCCEED, effectiveness questions will usually assess whether the intervention continues to be associated with improvements in students' performance when carried out under normal classroom conditions, by teachers who have not received special instruction, and without additional motivational support for participation.
Intervention Costs
Both developmental and recurrent costs associated with the intervention must be assessed. Issues here relate to the time, support, and effort required to implement the intervention both by individual faculty members and by departments. Generally, SUCCEED investigators should try to determine how much investment would be needed for another program or university to implement the intervention.
The use of a staggered approach to summative evaluation should allow one to identify and address operational difficulties in the use of the intervention. Too often, summative evaluations simply measure efficacy. If an intervention is to go beyond being a simple "pilot project", the investigators must also evaluate intervention effectiveness and cost.
THREE-STAGE APPROACH TO EVALUATION OF PRECISION TEACHING
When designing the precision teaching project at Georgia Tech, the investigators addressed the "Why Evaluate" question by organizing their evaluation activities into three stages. In stage one, the evaluation focus was on controlled studies, both formative and summative, designed to provide information that could be used to improve the educational materials and methods, and to improve their efficacy. In stage two, they focused on broader implementation of the educational intervention, or effectiveness evaluation, to demonstrate that precision teaching could improve student performance and, eventually, retention. In the third and final stage, which has not yet been implemented, they intend to evaluate the process of dissemination as the precision teaching model and materials are used by other universities, to ensure that it reaches a wide audience and is used to maximum effectiveness.
ACTION STEPS FOR SUCCEED INVESTIGATORS:

Step 2: Clarify Project Objectives
A prerequisite for evaluation is the development of a project plan with measurable objectives that are logically related to one another and to the goals and interventions defined in the project proposal. All objectives should specify what is to be done, by when. There are three types of objectives: impact, outcome, and process. Impact objectives should focus on changes in the long-term performance of engineering students that are expected to result from project activities, and should correspond to the priority goal of the project (e.g., retention of students as successful engineering professionals) as stated in the project proposal. Outcome objectives should focus on changes in knowledge, attitudes, behaviors, or availability of educational programs or supports that result from project activities, and should be directly related to the priority intervention (e.g., improved classroom teaching; extracurricular involvement of students in engineering-related projects), priority target population (e.g., entering freshmen, engineering majors), or those charged with the education of the target population (professors, graduate students, internship supervisors, etc.). Process objectives specify the actions needed for project implementation, and should correspond to the various activities (development of written or computer software materials, peer education sessions, placements in internships, training of educators, etc.) necessary to achieve the intended outcomes and impact.
The selection of project objectives is influenced by their importance to engineering education and their direct relationship to project goals; their feasibility and practicality given available resources, including the likelihood that they can be achieved within the stated time period; and their amenability to measurement and observation, including the availability of baseline information against which to assess progress.
SAMPLE OBJECTIVES FOR THE PT PROJECT
Impact Objectives
• To increase the proportion of engineering majors at Georgia Tech who graduate from 55% in 1992 to 75% in 1996.
• To increase the proportion of engineering majors in participating universities who are practicing engineers five years after graduation.
Outcome Objectives
• To increase the average proportion of students enrolled in the course covering "Electricity and Magnetism" who receive a passing grade from 70% in 1992 to 85% in 1996.
• To increase by 50% the proportion of physics faculty (professors and graduate students) who use PT principles in their courses by 1996.
Process Objectives
• To develop PT materials for use in undergraduate courses covering "Electricity and Magnetism".
• To systematically field test these materials in instructional settings and to refine/improve them based on the field test results.
• To conduct studies to investigate the efficacy and effectiveness of the PT approach and materials, and to improve them based on the results of the investigations.
• By 1996, to provide opportunities for all undergraduate Physics students to use the PT materials.
• By 1996, to introduce the PT approach and materials to all Georgia Tech faculty providing Physics instruction.
ACTION STEP FOR SUCCEED INVESTIGATORS:
• Review and, if necessary, revise existing project objectives. Ensure that appropriate impact, outcome, and process objectives have been specified.

Step 3: Create a Model of Change
A model of change clarifies underlying assumptions about how the proposed intervention will lead to the expected outcomes and goals of the intervention. While this sounds like a simple concept, it is often the weakest element of an evaluation plan. Development of a clear and correct model of change is the most critical step in the development of a sound evaluation plan.
What is a model of change? A model of change refers to the specific set of relationships that one believes connects the intervention to the achievement of the impact objectives of the project. As an example, we can look at a project whose purpose is to produce a multi-media tutorial system for teaching chemical engineering (CE). The goals of the project might include: 1) reducing the failure rate in CE; 2) increasing student interest in CE; and 3) increasing the number of students who complete a four-year program in CE. The model should specify how the proposed interventions will lead to these goals.
A simple model of change for this project might begin with the assumption that multi-media methods are a more effective method for presenting knowledge than didactic lectures. Because multi-media methods are more effective, students will learn more, retain more, and will therefore have a higher probability of passing the course. If they pass the course, they are more likely to have an increased interest in CE. And finally, if they like CE more, they are more likely to continue in the program and graduate.

If this model reflects the assumptions underlying the proposed intervention and how it leads to achievement of the project goals, investigators should try to assess each of the proposed links in the model of change. For example, do the students who use the multi-media system learn more than those taught by the traditional lecture system? Does this result in a higher percentage of students passing the course? Does this result in students liking chemistry more? Finally, does this result in more students choosing to remain in the chemical engineering program?
It could be that the intervention does increase learning (let's say that the students develop better conceptual knowledge of chemistry due to use of interactive simulations, as reflected in their laboratory worksheets) but this knowledge may not lead to a higher percentage of students passing the course. This could occur because the course grade is based on a curve or because the exams do not tap this increased conceptual understanding. Alternatively, students could learn more, perform better in the course, but still choose to drop out of CE. This could be because even when passing the course, they do not like chemistry more. Or it could be that they like chemistry so much after participating in the multi-media intervention that they decide to change majors from CE to chemistry.
The important point here is that the set of relationships theorized to exist between the intervention and the goals of the project must be clearly defined. To the extent possible, each of the defined relationships should then be measured as part of the evaluation plan, allowing you to determine why and how the project either succeeded in reaching its goals or failed to do so. The more specific you are in developing your model of change, the more useful the information generated by the evaluation will be.
Of course, few projects have sufficient resources to assess all assumptions. They must choose which of the relationships that exist in their model to test. These choices should be based on:

The model of change developed by the PT investigators identifies the following assumptions: A) sound PT materials can be developed and E&M instructors taught to use them; B) trained instructors will ensure that students use the PT materials as intended; C) students' use of PT materials will lead to improved performance on course quizzes and better course grades; D) improved performance in E&M will lead to improved performance in later courses building on E&M concepts; and E) improved course performance will lead to an increased number of students who graduate and continue to work in a physics-related field.
In a parallel section of the model, investigators assumed that the information gained through formative evaluation of the implementation of PT (F-H) would be used to improve and refine the materials for later applications (I-J).
The investigators then selected a limited number of assumptions as priorities for project evaluation. Assumptions "C" and "F" - "I" were selected, "C" because it would yield information about whether the most important project objectives could be achieved; and "F" - "I" because the development of high-quality materials is the foundation for the remainder of the project. Assumptions "A" (that once materials are prepared, instructors can be trained in their use) and "B" (that trained instructors can ensure that students carry out the PT as intended) were not included initially, because all precision teaching sessions were implemented by trained project personnel under careful supervision. Assumption "D" will be investigated at a later time, because data on student participation and performance in later courses are readily available through the University Registrar. Assumption "E", while important, cannot at present be investigated without a resource-intensive effort to develop tracking mechanisms for Georgia Tech alumni. While theoretically possible, such an effort goes beyond the resources currently available to the project.
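The prioritization described above can be recorded in a simple structure so that each assumption and its evaluation status stay visible as the project proceeds. The following is a minimal sketch, not the PT project's own tooling: the `Assumption` class and the status labels ("priority", "deferred", "not feasible") are illustrative names introduced here, and only assumptions A through E are listed because the text does not spell out F through J individually.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assumption:
    """One link in a model of change (hypothetical record layout)."""
    label: str
    statement: str
    status: str  # illustrative labels: "priority", "deferred", "not feasible"


# Statements paraphrase the PT model of change; status labels are assumptions.
model_of_change = [
    Assumption("A", "Sound PT materials can be developed and instructors taught to use them", "deferred"),
    Assumption("B", "Trained instructors will ensure students use the PT materials as intended", "deferred"),
    Assumption("C", "Use of PT materials leads to better quiz performance and course grades", "priority"),
    Assumption("D", "Improved E&M performance leads to improved performance in later courses", "deferred"),
    Assumption("E", "Improved course performance leads to more graduates working in the field", "not feasible"),
]

# Pull out the links that the evaluation will actually test first.
priorities = [a.label for a in model_of_change if a.status == "priority"]
print("Evaluate first:", priorities)
```

Keeping the status next to each assumption makes it easy to revisit deferred links when resources or data sources change.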
ACTION STEPS FOR SUCCEED INVESTIGATORS:
• Develop a model of change for your project, making it as specific and complete as possible.
• Review each of the assumptions (lines) in the model. Using the criteria presented above, identify the priority assumptions to be addressed through the project evaluation.

Step 4: Select Criteria and Indicators
Once measurable objectives and priority assumptions have been defined, investigators can make plans for evaluation based on specific criteria and indicators. Criteria are technical standards that can be used as the basis for making judgments about the quality of a curriculum, intervention, or other project component. For example, criteria for a curriculum might include whether it includes measurable learning objectives, or the quality of support and training provided to educators in the use of participatory learning methods.
Indicators are quantified measurements that can be repeated over time to track progress toward the achievement of objectives. Most indicators are expressed as rates or proportions, and include a numeric numerator and denominator. Selection of indicators should be based on their:
• validity, defined as the extent to which the indicator is a true and accurate measure of the phenomenon under study;
• reliability, defined as the extent to which indicator measurements are consistent and dependable across applications or over time;
• sensitivity, or likelihood of change within a reasonable time period and as a result of successful project implementation without undue influence of extra-project factors;
• ability to produce data that can be easily interpreted; and
• usefulness in guiding project change.
In addition, only those indicators that can be measured with available project resources should be selected.
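Because most indicators are rates with a numerator and a denominator, the arithmetic can be captured in a few lines. The sketch below is illustrative only: the `Indicator` class is a hypothetical helper, and the counts are invented to show the calculation, not drawn from any project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    """A rate-type indicator: a count divided by the population it is drawn from."""
    name: str
    numerator: int    # e.g., students who passed the course
    denominator: int  # e.g., students enrolled in the course

    def rate(self) -> float:
        if self.denominator <= 0:
            raise ValueError("denominator must be positive")
        if not 0 <= self.numerator <= self.denominator:
            raise ValueError("numerator must lie between 0 and the denominator")
        return self.numerator / self.denominator


# Hypothetical counts, chosen only to illustrate the arithmetic.
pass_rate = Indicator("Course pass rate", numerator=140, denominator=200)
print(f"{pass_rate.name}: {pass_rate.rate():.0%}")
```

Defining the numerator and denominator explicitly, and using the same definitions at every measurement, is what allows the indicator to be compared over time.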
SELECTED CRITERIA AND INDICATORS USED IN PT PROJECT
Impact
Graduation rate:
Number of engineering majors at Georgia Tech who graduate
divided by
Number of engineering majors at Georgia Tech
Professional Retention Rate:
Number of students participating in precision teaching during undergraduate study who are professionally employed in a position that requires physics skills five years after their graduation
divided by
Number of students participating in precision teaching during undergraduate study
Outcome
Course Pass Rate:
Number of students enrolled in E & M who pass
divided by
Number of students enrolled in E & M
Precision Teaching Coverage Rate:
Number of physics faculty who taught physics course during preceding semester who report using precision teaching method
divided by
Number of physics faculty who taught physics course during preceding semester
Process
Precision teaching materials for E & M developed
Precision teaching materials field-tested, revised and finalized
Efficacy studies completed and results used to refine materials
Number of students participating in precision teaching
Number of faculty participating in precision teaching
Cost
Total direct costs (excluding developmental costs) of intervention per student contact hour.
ACTION STEP FOR SUCCEED INVESTIGATORS:
• Define a set of indicators and criteria for project objectives (impact, outcome, and process).

Step 5: Identify Data Sources and Define How Often Indicators will be Measured
Once criteria and indicators have been defined, SUCCEED investigators must identify the best sources of data and determine how often these variables will be measured. Reports and records collected routinely by project or institutional personnel, such as class attendance reports, graduation records, SAT scores, or student performance on examinations, can be important sources of evaluation data if they are of sufficient accuracy. Where such data do not exist or are not accurate, special studies or audits may be necessary. Investigators should also explore whether data collected for other purposes or projects may be available and appropriate for use in evaluating SUCCEED activities. For example, student course evaluations conducted for other educational purposes may provide an opportunity to obtain data specific to SUCCEED project activities.
Investigators must also define how often indicators will be measured. Considerations include:
• the resources needed to collect data for the indicator (e.g., data from student examinations can be collected more frequently than data from a special assessment of student skills);
• when indicator data will be needed to guide project decision making (e.g., data should be collected, analyzed, and prepared for review before rather than after a project review exercise); and
• when meaningful changes in indicator levels can be expected given project activities (e.g., there is no need to measure changes in student retention rates in engineering programs if no meaningful activities have been directed at retention over the course of the project).
Data Sources and Periodicity for PT Project Indicators
The key to successful evaluation is careful organization. PT project investigators identified the data they would need to measure each of their priority indicators, and the source from which that data could be obtained. "Project records" reflects the need for a careful management information system that includes the data points needed for all monitoring and evaluation. Special evaluation studies require additional data bases, but the summary results should always be maintained in project records. As is clear in the table below, tracking (by name or ID number) of individual students who have participated in PT instruction will be essential for later follow-up studies of graduation and retention rates.
Indicator                                       Data Source(s)                                      Frequency
Graduation rate (total and at-risk)             Registrar; Project records                          Annual
Professional retention rate                     Special alumni survey; Registrar; Project records   Annual
Course performance (total and at-risk)          Registrar; Project records                          Biweekly
PT coverage rate                                University course records; Special faculty survey   Quarterly
PT materials developed                          Project records                                     Monitor until achieved
PT materials field-tested, revised, finalized   Project records                                     Monitor until achieved
Efficacy studies completed                      Project records                                     Monitor until achieved
No. students exposed to PT                      Registrar; Project records                          Quarterly
No. faculty exposed to PT                       Project records                                     Quarterly
Cost                                            Project records                                     Quarterly
ACTION STEP FOR SUCCEED INVESTIGATORS:

Step 6: Design Evaluation Research
The key to a good evaluation plan is the design of the study or studies to answer the evaluation questions. There are many possible research designs and plans. Your objective should be to maximize the reliability and the validity of your evaluation results.
Reliability refers to the consistency or dependability of the data. The idea is simple: if the same test, questionnaire, or evaluation procedure is used a second time, or by a different research team, would one obtain the same results? If so, the test is reliable. In any evaluation or research design, the data collected are useful only if the measures used are reliable.
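One common way to put a number on this idea is to administer the same instrument twice and correlate the two sets of scores (often called test-retest reliability). The sketch below assumes that approach, which the primer itself does not prescribe; the function name `pearson_r` and the scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return covariation / sqrt(spread_x * spread_y)


# Hypothetical scores for the same five students on two administrations.
first_administration = [12, 15, 9, 18, 14]
second_administration = [13, 14, 10, 17, 15]

r = pearson_r(first_administration, second_administration)
print(f"Test-retest correlation: {r:.2f}")
```

A correlation close to 1 suggests that students are ranked consistently across the two administrations; what counts as acceptable depends on the instrument and its use.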
Validity refers to the extent to which the questions or procedures actually measure what they claim to measure. Another way to say this is that valid data are not only reliable, but are also true and accurate. Measures used to collect data about a variable in your evaluation study must be both reliable and valid if the overall evaluation is to produce useful data.
Investigators should select a research design that controls for as many threats to validity as possible. Of course, few studies can control completely for all threats, and investigators are often constrained by cost, availability of subjects, or other factors that preclude the optimal study design. However, the key is to systematically assess possible designs based on the various threats to validity, and select the design that is most valid given other constraints. Below we will give a brief overview of three of the major threats to validity in evaluation research designs, followed by an overview of qualitative and quantitative research methods.
Common Threats to Validity
Selection. A common threat to validity occurs when the people selected for the experimental group are different from those in the comparison group. For example, suppose you want to determine if tutorial sessions will improve course performance. In seeking to answer this question you ask for volunteers from the class to participate in the tutorial sessions and then compare their performance in the course to the students who did not volunteer. The question, however, is whether the two groups of students are alike in all characteristics except for participation in the tutorial session. Perhaps better students (or more motivated students) volunteer for the extra work. Any differences in course performance may be due simply to the selection bias introduced through asking students to volunteer rather than randomly assigning students to the tutorial group. Investigators need to ensure that the students in all the groups being compared on course or test performance are equal in all the characteristics that may affect performance (e.g., knowledge, skills, motivation). If this is not possible, it may be possible to address some differences through statistical analysis.
Mortality. Mortality refers to the differential loss of students from an intervention as compared to the usual treatment group, resulting in differences between the students in the groups at the time of testing. For example, one could assign students to one of two groups, a group which spends an extra hour each week solving problems and another group that has small, one-hour discussion groups weekly. It could be that more students would drop out of the problem group than the discussion group, especially those with less motivation. If this occurs, one could end up with differences between the students in the two groups that could be the source of any differences in performance.
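A simple first check for this threat is to compare dropout rates across the groups before comparing their performance. The following sketch uses hypothetical enrollment counts; `attrition_rate` is an illustrative helper, not part of any project system.

```python
def attrition_rate(enrolled_at_start: int, remaining_at_test: int) -> float:
    """Proportion of a group lost between assignment and testing."""
    if enrolled_at_start <= 0:
        raise ValueError("group must start with at least one student")
    if not 0 <= remaining_at_test <= enrolled_at_start:
        raise ValueError("remaining count must lie between 0 and the starting count")
    return (enrolled_at_start - remaining_at_test) / enrolled_at_start


# Hypothetical counts for the two groups in the example above.
problem_group = attrition_rate(enrolled_at_start=40, remaining_at_test=28)
discussion_group = attrition_rate(enrolled_at_start=40, remaining_at_test=36)

differential = problem_group - discussion_group
print(f"Problem group attrition:    {problem_group:.0%}")
print(f"Discussion group attrition: {discussion_group:.0%}")
print(f"Differential attrition:     {differential:.0%}")
```

A large differential is a warning that the students remaining in each group may no longer be comparable, so performance differences should be interpreted with caution.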
Hawthorne Effect. The Hawthorne effect, while not normally described as a threat to validity, is one issue that evaluators of educational interventions must consider. The Hawthorne Effect can perhaps best be explained by relating it to the concept of placebo effects. As we all know, it has been shown that when people believe they are being given an effective treatment, whether for a psychological or physical illness, they tend to improve even if the treatment is simply a sugar pill. People begin to feel or perform better because of increased motivation or self-confidence. The Hawthorne Effect is similar. It states that when one introduces a new method of performing a task and participants know that it is part of an effort to improve performance, there is a temporary gain in performance, even if the new method is no better (or even worse) than the old way of doing things. The explanation for this is that when people are told a new system will improve their performance and when they know they are being watched or evaluated, there is a tendency for increased effort and motivation that results in better performance. However, this increase in performance is only temporary. The Hawthorne Effect can seriously affect the validity of evaluation results, particularly if you are evaluating a new educational intervention.
Sound evaluation plans include study designs that control for these threats to validity. In the following section we will provide an overview of various research designs.
Research Designs
Qualitative Research. Some evaluation questions address issues that are not easily quantified or counted. Particularly in formative research, SUCCEED investigators may be interested in faculty or student attitudes about an intervention or approach, their ideas about how it could be improved, or their explanations about why they performed in a particular way. Qualitative research can help investigators understand these issues.
Qualitative research must be undertaken with the same level of methodological rigor as quantitative research. Indeed, for investigators without previous experience, we recommend that they identify an experienced qualitative researcher to provide technical assistance.
Qualitative methods that may be particularly useful include the following:
After two quarters in which the PT approach was implemented, a series of focus groups was conducted to gain a better understanding of students' attitudes about the PT intervention. Four different types of focus groups were conducted, and participants who met the criteria for each type were randomly selected for participation. The four types of groups were: 1) students who participated in PT but did not receive a passing grade in the course; 2) students who participated in PT and did pass the course; 3) students who did not participate in PT; and 4) students who attended specific, recent PT sessions.
After obtaining student consent and assuring participants that their contributions would be treated as confidential, a trained project investigator led a focused discussion. All responses were audio taped.
Results obtained through the focus group discussions were used to modify both the PT intervention itself, and the way in which it was introduced and organized relative to the E&M course. Examples of findings and resulting modifications include:

Non-experimental designs. Non-experimental designs are generally used only when one is trying to collect descriptive data. These types of studies are characterized by the absence of a control or comparison group. There are two commonly used non-experimental designs in evaluation research: (1) the Posttest-Only Design and (2) the Pretest-Posttest Design.


There are several key points to note about both of these non-experimental designs. First, while both can be used for descriptive purposes, neither can be used to claim that the intervention is better than any other intervention. The Pretest-Posttest Design does allow one to judge the amount of gain made by the treatment group, but you cannot attribute this change to your intervention. The passage of time alone, or other events that occurred between the first and second tests, could have caused the gains. Because of these problems, non-experimental designs are the designs of last choice.
Quasi-Experimental Designs. Quasi-experimental designs are studies that follow the basic structure of a true experiment, but without controlling for differences in subject selection. That is, the subjects are not randomly assigned to conditions. There are two classic quasi-experimental designs that will be discussed: time series design and nonequivalent control group design.


Time series designs are similar to non-experimental pretest-posttest designs, with the added advantage of repeated measurements before and after the intervention. The primary advantage of this type of design is that it gives trend information. One can compare the changes between O3 and O4 (the observations immediately before and after the intervention) to all other pairs of observations. If the intervention is the cause of the change (not time, or changes in subjects' performance due to aging or learning in other courses) the changes between O3 and O4 should be greater than those between any other pair of observations.
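The comparison of changes between adjacent observations can be sketched as follows. This is a descriptive check only, not a statistical test; it assumes six observations with the intervention falling between the third and fourth, and the scores are hypothetical.

```python
def adjacent_changes(observations):
    """Change from each observation to the next one."""
    return [later - earlier for earlier, later in zip(observations, observations[1:])]


def intervention_change_is_largest(observations, last_pre_index):
    """True if the change spanning the intervention exceeds every other adjacent change.

    last_pre_index is the 0-based position of the last observation taken
    before the intervention (2 for O3 when observations are O1..O6).
    """
    changes = adjacent_changes(observations)
    if not 0 <= last_pre_index < len(changes):
        raise ValueError("last_pre_index must point to an observation followed by another")
    target = changes[last_pre_index]
    others = changes[:last_pre_index] + changes[last_pre_index + 1:]
    return all(target > change for change in others)


# Hypothetical mean class scores for O1..O6; intervention between O3 and O4.
scores = [61.0, 62.0, 62.5, 70.0, 70.5, 71.0]

print(adjacent_changes(scores))
print(intervention_change_is_largest(scores, last_pre_index=2))
```

If the jump across the intervention is no larger than the ordinary drift between other observations, the trend data give little reason to credit the intervention.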
The nonequivalent control group design has the advantage of providing a direct comparison group. It controls for changes that may be due to time or other causes, but does not control for subject differences. However, if the two groups are equivalent on the pretest scores, the threat to the validity of the study due to differences in subjects is somewhat reduced.
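One simple way to summarize data from a nonequivalent control group design is to check how far apart the two groups were on the pretest and then compare their gains. The sketch below assumes each group has pretest and posttest scores on the same scale; the function names and all scores are hypothetical.

```python
from statistics import mean
from typing import Dict, List


def mean_gain(pretest: List[float], posttest: List[float]) -> float:
    """Average posttest score minus average pretest score for one group."""
    return mean(posttest) - mean(pretest)


def compare_groups(intervention_pre: List[float],
                   intervention_post: List[float],
                   comparison_pre: List[float],
                   comparison_post: List[float]) -> Dict[str, float]:
    """Summarize pretest equivalence and the difference in gains."""
    intervention_gain = mean_gain(intervention_pre, intervention_post)
    comparison_gain = mean_gain(comparison_pre, comparison_post)
    return {
        # Near zero suggests the groups started out similar on this measure.
        "pretest_difference": mean(intervention_pre) - mean(comparison_pre),
        "intervention_gain": intervention_gain,
        "comparison_gain": comparison_gain,
        # Extra gain in the intervention group beyond the comparison group.
        "gain_difference": intervention_gain - comparison_gain,
    }


# Hypothetical scores for two intact (not randomly assigned) groups.
summary = compare_groups(
    intervention_pre=[50.0, 55.0, 60.0, 65.0],
    intervention_post=[70.0, 72.0, 78.0, 80.0],
    comparison_pre=[52.0, 54.0, 61.0, 63.0],
    comparison_post=[58.0, 60.0, 66.0, 68.0],
)
print(summary)
```

Similar pretest means reduce, but do not remove, the possibility that the groups differ in ways the pretest does not measure.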
Experimental Designs. The key distinction that separates experimental designs from non- or quasi-experimental designs is the RANDOM ASSIGNMENT of subjects into the intervention groups. Random assignment helps ensure that subjects in the groups will be equal before the intervention is introduced. This helps eliminate bias due to subject selection. We will briefly describe two of the more common experimental designs: the pretest-posttest control group design and the multiple intervention design.
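Random assignment is straightforward to carry out in practice. The following is a minimal sketch, assuming a roster of student identifiers is available; the function name, the identifiers, and the group labels are hypothetical. Fixing the seed makes the assignment reproducible so that it can be documented.

```python
import random
from typing import Dict, List, Optional, Sequence


def randomly_assign(subjects: Sequence[str],
                    group_names: Sequence[str],
                    seed: Optional[int] = None) -> Dict[str, List[str]]:
    """Shuffle the subjects, then deal them out to the groups in turn."""
    rng = random.Random(seed)          # local generator; seed is optional
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups: Dict[str, List[str]] = {name: [] for name in group_names}
    for position, subject in enumerate(shuffled):
        # Dealing in rotation keeps group sizes within one of each other.
        groups[group_names[position % len(group_names)]].append(subject)
    return groups


# Hypothetical roster of twelve students and two study groups.
roster = [f"S{number:02d}" for number in range(1, 13)]
assignment = randomly_assign(roster, ["intervention", "control"], seed=1)
for name, members in assignment.items():
    print(name, members)
```

Because chance alone decides which group each student joins, neither the students nor the investigators can influence group membership.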

The Pretest-Posttest Control Group Design has several advantages over the designs presented earlier. First, it provides for random assignment of students into groups, helping eliminate the threat of selection bias. Second, it provides a clear comparison group and uses a pre- and posttest design, allowing one to measure not only differential gains between groups, but also absolute gains in skills and knowledge. The only weakness in this design is that it does not control for the Hawthorne Effect.
The multiple intervention design has the advantage of controlling for threats to validity due to selection and the Hawthorne Effect. In addition, if interventions are based on theoretical understanding of how the intervention produces change, isolating individual or groups of causal variables, it can be used to identify the specific causes of any changes in learning due to the intervention.
In the multiple intervention design, the intervention groups can be systematically designed to vary on how much of the total intervention is received by students in each group. For example, if one is interested in determining the effectiveness of a multi-media tutoring system in teaching chemistry, there may be many aspects of the system that one believes will aid learning (e.g., additional simulations, structured drill). One could design the study so that one group receives the simulations only, one group the structured drill only, one group both the structured drill and the simulations, and the fourth group extra chemistry problems to work. By comparing the four groups on how much chemistry was learned (e.g., exam and course grades), one could determine the relative effectiveness of the drill alone, the simulations alone, and the two combined, and, with the problem set group, the effect of additional time spent studying without the use of the multi-media system. By randomly assigning subjects to the groups, one controls for selection bias, most other threats to validity, and the Hawthorne Effect.
This use of a multiple intervention group design provides the best test of the effectiveness of the proposed intervention, yielding data on both process and outcome variables. It does this by isolating the effects of specific variables in the overall intervention. This type of design can be combined with a pretest-posttest design, yielding even more data regarding initial equivalence of groups. The use of multiple intervention groups allows one to test the independent effects of variables in a complex intervention, and provides an easy way to control for the Hawthorne Effect and time-on-task, which other designs do not offer. This is clearly the superior study design for the evaluation of most SUCCEED projects, although it is often difficult for investigators to implement.
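For the four-group chemistry example, a first look at the data might compare each intervention group's mean score with that of the extra-problems group. The sketch below is illustrative only: the group labels, the function name, and the exam scores are hypothetical, and a real analysis would add a significance test such as an analysis of variance.

```python
from statistics import mean
from typing import Dict, List


def effects_relative_to_comparison(scores_by_group: Dict[str, List[float]],
                                   comparison_group: str) -> Dict[str, float]:
    """Mean score of each group minus the mean of the comparison group."""
    baseline = mean(scores_by_group[comparison_group])
    return {
        group: mean(scores) - baseline
        for group, scores in scores_by_group.items()
        if group != comparison_group
    }


# Hypothetical exam scores for the four randomly assigned groups.
exam_scores = {
    "simulations_only": [74.0, 76.0, 78.0],
    "drill_only": [72.0, 74.0, 76.0],
    "drill_and_simulations": [80.0, 82.0, 84.0],
    "extra_problems": [68.0, 70.0, 72.0],   # controls for time-on-task
}
effects = effects_relative_to_comparison(exam_scores, "extra_problems")
print(effects)
```

Because the extra-problems group also spends additional time studying, differences from that group reflect the multi-media components rather than time-on-task alone.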
Sample Study Designs for the PT Project
Study 1: Monitoring of Process and Cost Indicators
An "Activity Information Sheet" was created to monitor project inputs and costs per quarter. Data elements included: number of hours by each category of project personnel (including faculty time); number of hours of student contact, by type of activity; project expenditures; and major activities. Monthly summaries by category provide estimates of progress and costs. These summaries are then used as the basis for monthly progress meetings of project investigators. Although tracking of process indicators does not fall neatly into one of the research designs presented above, it is an essential part of project activity.
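A routine record of this kind can be kept in a spreadsheet or in a few lines of code. The sketch below shows one possible way to roll individual entries up into monthly summaries by personnel category; the record fields, month labels, and figures are hypothetical and are not taken from the PT project's actual sheet.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ActivityRecord:
    month: str          # hypothetical label, e.g. "M01"
    category: str       # personnel category, e.g. "faculty" or "tutor"
    hours: float        # hours worked on the project
    expenditure: float  # project funds spent, in dollars


def monthly_summary(
    records: List[ActivityRecord],
) -> Dict[Tuple[str, str], Dict[str, float]]:
    """Total hours and expenditure for each (month, category) pair."""
    totals: Dict[Tuple[str, str], Dict[str, float]] = defaultdict(
        lambda: {"hours": 0.0, "expenditure": 0.0}
    )
    for record in records:
        key = (record.month, record.category)
        totals[key]["hours"] += record.hours
        totals[key]["expenditure"] += record.expenditure
    return dict(totals)


# Hypothetical entries from an activity sheet.
records = [
    ActivityRecord("M01", "faculty", 10.0, 500.0),
    ActivityRecord("M01", "tutor", 20.0, 200.0),
    ActivityRecord("M01", "faculty", 5.0, 250.0),
    ActivityRecord("M02", "tutor", 15.0, 150.0),
]
for key, totals in sorted(monthly_summary(records).items()):
    print(key, totals)
```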
Study 2: Efficacy Evaluation of PT for E&M Course
The purpose of the study was to determine whether participation in the PT intervention was associated with improved student performance in the course. The study used a nonequivalent control group design. After a brief explanation of the PT intervention, approximately 60% of the 140 students enrolled in E&M courses volunteered to participate. There were no significant differences in overall grade point average or grade in the preceding physics class (Particle Dynamics) between students who volunteered and those who did not, suggesting that selection bias was limited. Each student in the intervention group was assigned to a weekly, one-hour review session. At the first session, the tutors explained how to use the materials and how students should record the time they spent working on them. The dependent measures included scores on five course quizzes and the final course grade.
Approximately 50% of the student volunteers were lost due to subject mortality (they did not complete at least 60% of the precision teaching materials).
As shown in the Figures on the next page, students in the intervention group performed significantly better on all five quizzes than did students in the control group (p<.05). As shown in Figure 2, a significantly higher proportion of students in the intervention group received course grades of "A" or "B", and fewer made grades of "C", "D", or "F", than those in the control group (Chi-square: 14.52 df=8; p<.05).
The results indicated that the precision teaching intervention was associated with better student performance as reflected in quiz scores and course grades. The investigators are now conducting further analyses to determine the difficulty level of specific items in the materials, and the relationship between specific items and course performance. The results will be used to refine the materials.
Study limitations included the fact that students were not randomly assigned to either the precision teaching or control group; the fact that there was no alternative intervention to control for the Hawthorne effect; and the high drop out (mortality) rate in the intervention group. Investigators are currently working to strengthen the study design for future trials.
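A comparison of grade distributions like the one in Study 2 is usually made with a chi-square test of independence. The sketch below shows how such a statistic is computed for a simple two-by-two table (group by "A or B" versus "C, D, or F"). The counts and function names are hypothetical and are not the project's data; the p-value helper applies only to tables with one degree of freedom, and no continuity correction is applied.

```python
import math
from typing import List, Tuple


def chi_square_independence(table: List[List[float]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom for a table."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if group and grade were unrelated.
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


def p_value_one_df(statistic: float) -> float:
    """Upper-tail probability for a chi-square statistic with 1 df only."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical counts: rows are groups, columns are "A or B", "C, D, or F".
grade_counts = [
    [30, 10],   # intervention group
    [20, 20],   # control group
]
statistic, df = chi_square_independence(grade_counts)
print(round(statistic, 2), df, round(p_value_one_df(statistic), 3))
```

For tables with more than one degree of freedom, the p-value should be taken from a statistics package or a chi-square table rather than from the one-degree-of-freedom helper above.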


ACTION STEPS FOR SUCCEED INVESTIGATORS:
• Design evaluation research studies for key questions.
• Carry out studies and report the results.
• Where appropriate, use the results to improve project interventions or operations.

Step 7: Monitor and Evaluate
Once project investigators have developed a plan for evaluation, the next challenge is to actually carry it out successfully. This is harder than it may seem. All too often, evaluation is forgotten amid the day-to-day pressures of project implementation, and becomes important only when reports are due or publications are being prepared. Under these conditions, the essential formative role of evaluation as a means of improving project interventions and operations is lost.
Strategies that can help ensure that evaluation activities are an integral part of the project include:
• establish a routine information system for the project, including inputs (time, resources), outputs (activities completed, student contact hours), and outcomes (student course grades, interim results of evaluation activities). Once the system is established, a member of the project staff should be held responsible for keeping it up to date.
• include evaluation activities in the project budget. Too often, the costs associated with carrying out project evaluation are underestimated or even omitted in the original project budget. Once you have developed an evaluation plan, estimate the costs of implementing it and include these in your budget.
• hold regularly scheduled monitoring and evaluation meetings for project staff. All those who work on a SUCCEED project should be familiar with the project objectives and how they will be evaluated. Each individual should bear responsibility for documenting progress toward process objectives, and for reviewing the evaluation results as the basis for project improvement.
• encourage review and revision of the evaluation plan. Although the establishment of project objectives and associated indicators is a fundamental part of project planning, they may change as project interventions develop and are refined. Do not hesitate to revise aspects of the evaluation plan -- to strengthen the research designs, to select alternative indicators if the original ones are not sufficiently sensitive to project achievements, or to incorporate the results of formative research.
ACTION STEP FOR SUCCEED INVESTIGATORS:
• Implement your plan for project evaluation. Ensure that you have a working management information system for the project and sufficient money, time, and people to carry out planned evaluation activities, and that regular meetings are held to review and use the evaluation results.

Step 8: Use and Report Evaluation Results
All evaluation is wasted unless the results are used to improve project operations or interventions. This essential step, however, is frequently overlooked. For SUCCEED projects, all reports should include not only evaluation results and reports of progress, but detailed explanations of how those results were used to reinforce, refine, or modify project activities.
Evaluation activities should be fully incorporated into the project management process (design, implement, evaluate, redesign...). Too often, insufficient time and resources are available for the redesign stage. You should schedule project activities to allow time for reviewing evaluation results and modifying project design after results become available and before the next iteration of the intervention begins.
The purpose of educational science is to improve educational practice. Therefore, it is essential that you use evaluation results not only to inform your project and the SUCCEED coalition, but also that you disseminate them to a wider audience. Dissemination can and should include the publication of your evaluation results in peer-reviewed journals, or presentations at professional conferences. In addition, there are a number of less formal avenues that can be used to share preliminary results and experiences.
ACTION STEPS FOR SUCCEED INVESTIGATORS:
• Ensure that your project timeline includes adequate time for the interim analysis and review of evaluation results, and for modification of project interventions.
• Publish and present your evaluation findings in appropriate journals and at relevant conferences.
Conclusions
Evaluation is essential to improve the quality and effectiveness of projects designed to improve engineering education. The first step in the development of appropriate evaluation activities is to incorporate an evaluation strategy into the project planning process.
So where to start? Most currently funded SUCCEED projects do not have the personnel or financial resources to design and implement comprehensive evaluations of their projects. A practical approach to this dilemma is to proceed incrementally, beginning with what is possible now and gradually increasing evaluation activities as the project develops. Projects should strive to evaluate a few components well, rather than several poorly or not at all. SUCCEED investigators may want to focus their short-term evaluation efforts on the most important process and outcome objectives of their projects. From an evaluation perspective, a focus on implementation and immediate outcomes is advantageous because relatively inexpensive and straightforward methods for valid assessments of student performance exist and have been used successfully to evaluate other educational interventions.
A limited set of priority indicators useful to project managers should be identified in an overall plan for evaluation. The plan should specify the data sources and how often indicators will be measured. Priority indicators will vary from project to project, based on their goals and specific objectives. Project directors should systematically select the indicators appropriate for their project as a part of the planning process. In addition, Regional or National SUCCEED managers may have uniform indicators to be collected by all projects; you should discuss the selection of indicators with your Project Officer to ensure that any key indicators are adequately covered by your evaluation plan.
For evaluation to lead to improvements in educational programs, it must be clearly defined as a part of the project activities. Investigators can increase the yield from their project evaluation activities by working collaboratively with other disciplines and with national staff of the SUCCEED project to define appropriate guidelines, evaluation questions and methods. A coordinated approach will conserve resources and allow comparisons among various approaches.
FURTHER READING ON PROJECT EVALUATION
General Readings on Evaluation
Herman JL (ed.). Program evaluation kit (nine volumes). Sage: 1987.
Rossi PH & Freeman HW. Evaluation. Sage: 1993 (5th ed.).
Quantitative Research Design and Statistics
Babbie E. The practice of social research (2nd ed.). Wadsworth: 1995.
Campbell DT & Stanley JC. Experimental and quasi-experimental designs for research. Rand McNally: 1963.
Creswell JW. Research design: Qualitative and quantitative approaches. Sage: 1994.
Fisher AA, Laing JE, Stoeckel JE & Townsend JW. Handbook for family planning operations research design (2nd ed.). The Population Council: 1991.
Sirkin RM. Statistics for the social sciences. Sage: 1994.
Qualitative Research
Denzin NK & Lincoln YS (eds.). Handbook of qualitative research. Sage: 1993.
Krueger RA. Focus groups: A practical guide for applied research. Sage: 1994.
Miles MB & Huberman AM. Qualitative data analysis: An expanded sourcebook. Sage: 1994 (2nd ed.).
Strauss A & Corbin J. Grounded theory procedures and techniques. Sage: 1990.
IF YOU WOULD LIKE MORE HELP IN PROJECT EVALUATION,
you can contact your SUCCEED Center Director. The Coalition is committed to providing technical assistance in the development of an evaluation plan for your project.