Teacher Development Programme In-Service Training Impact Evaluation 2017, Endline Survey
The quantitative endline survey for the impact evaluation of Phase 1 (In-Service Training) of the Teacher Development Programme (TDP) is the second of two rounds of data collection. The first round (baseline) was conducted in October 2014 - January 2015 and the second round (endline) was conducted in October - November 2017.
The Teacher Development Programme (TDP) is a six-year (2013-2019) programme funded by the UK Department for International Development (DFID), with a total budget of £34 million. It seeks to improve the quality of teaching in primary schools, junior secondary schools, and colleges of education at the state level in northern Nigeria. It works through in-service training for primary teachers, reform of pre-service teacher education, and strengthening evidence-based research on teaching. This evaluation focuses on the first of these three outputs: in-service training and support for primary teachers in the three core curriculum subjects of English, mathematics, and science. The programme initially operated in three states, Jigawa, Katsina, and Zamfara, and was later extended to Kaduna and Kano. This survey covers a group of schools that were randomly assigned to receive the TDP intervention in Jigawa, Katsina, and Zamfara, and a control group of schools in the same states that did not receive the TDP intervention.
The impact evaluation has three main purposes:
-Formative - to help inform the implementation of TDP in Phase 1 of in-service activities and the design and implementation in its Phase 2;
-Summative - to help inform TDP, DFID Nigeria, and other education stakeholders if TDP's in-service teacher training activities have led to improvements in teacher effectiveness and pupil learning levels in English, mathematics, and science and technology; and
-Learning - to assess from TDP what might work for improving teacher effectiveness in Nigeria and elsewhere.
The impact evaluation is a theory-based, mixed-methods design. The endline research focuses on the efficiency, effectiveness, impact, and sustainability of the TDP (programme relevance was assessed through the baseline survey). The overarching evaluation questions for the endline research are:
-Impact: Has TDP caused changes in pupil learning in English, mathematics, and science in TDP schools? (quantitative)
-Effectiveness: Has TDP led to changes in teacher effectiveness? (qualitative and quantitative)
-Efficiency: Were TDP results achieved on time and to plan? How does TDP's organisational set-up facilitate delivery? (qualitative and quantitative)
-Sustainability: Are TDP's impacts on teacher effectiveness sustainable without further DFID support? (qualitative)
At the core of this impact evaluation is a constrained randomised design, with half of the sample schools assigned to receive the TDP intervention, while the other half was assigned to the control group. The main source of quantitative data is the sample panel survey of 330 schools in Jigawa, Katsina, and Zamfara conducted in 2014 and 2017. At each sample school, the head teacher and a sample of teachers and pupils were interviewed and tested on their subject knowledge in English, mathematics, and science; lesson observations were conducted; a teacher roster was compiled; and classroom attendance by teachers and pupils was measured. The baseline and endline surveys were timed to allow sufficient time for TDP impact on pupil learning outcomes to be feasible, while ensuring that a panel of pupils could be sampled at baseline and endline within the same school, and producing results in time for them to be of value to the final stages of the TDP.
The endline quantitative survey fieldwork was carried out by OPM Nigeria and Education Data, Research and Evaluation in Nigeria (EDOREN). The data from the endline survey available in the World Bank Microdata Catalog are from the TDP IE quantitative endline survey conducted in 2017. For the qualitative research findings and methods see the final endline report under Related Materials.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
The primary sampling units (PSUs) of the survey are TDP-eligible public primary schools in the three states Jigawa, Katsina, and Zamfara. The secondary sampling units are teachers (selected prior to the PSUs and teaching grades 1-3 in any of the three subjects: English, mathematics, or science), pupils (in grade 3 at baseline and taught English, mathematics, or science by at least one of the sample teachers, and in grade 6 at endline) and lessons taught by the selected teachers (not sampled).
See the 'Sampling' section for more details.
Version 2.1: Edited, anonymous dataset for public distribution.
Version 2.1 consists of six edited and anonymised datasets (at school, teacher interview, pupil, lesson, teacher roster and classroom level) with the responses to a small number of questions removed (see 'TDP IE List of variables excluded from endline survey datasets' provided under Technical Documents); these were removed for confidentiality purposes or because they were not needed for the analysis. Some of the datasets also contain selected constructed indicators. These constructed indicators are included either to (i) save data users time as they require complex reshaping and extraction of data from multiple sources (but they could be generated by data users if preferred), or (ii) to provide indicators that cannot be generated from the provided datasets due to the need to keep item responses from the pupil learning assessment confidential.
The endline survey administered seven different instruments, compared to five at the baseline. The two new instruments at endline are 1) the classroom attendance and 2) the teacher roster and background instruments. New modules were added to the head teacher interview, teacher interview, classroom observation, and pupil learning assessment and background questionnaires at endline.
All instruments except the Teacher Development Needs Assessment (TDNA) were administered using computer-assisted personal interviewing (CAPI). The TDNA was administered on paper to mimic real life marking of pupil tests and preparation of worksheets.
***1. HEAD TEACHER INTERVIEW***
- Number of pupils registered and number of teachers employed at the school
- Teacher attendance from school records
- Head teacher background including gender, age, years of experience, academic qualifications
- Self-reported absenteeism from school
- Current teaching activities (head teachers who teach any of the primary grades)
- Teaching and school leadership and management training
- Access to and use of TDP materials
- Frequency of TDP school support visits
- Head teacher activities, meetings, and supervision
- Head teacher workload
- Teacher motivation (head teachers who regularly teach primary classes in the current school year)
- Head teacher school leadership and management practices
- Use of mobile phone
- School infrastructure and resources
- Condition of the school's roof, outer and inner walls, windows and playground (enumerator observation)
***2. TEACHER INTERVIEW***
- Teacher background including gender, age, years of experience, academic qualifications
- Self-reported absenteeism from school
- Teaching training
- Access to and use of TDP materials
- Frequency of TDP school support visits
- Activities, meetings and supervision by the head teacher
- Current teaching activities
- Teacher workload
- Teacher motivation
- Use of mobile phone
***3. CLASSROOM OBSERVATION***
- Teacher talk, language, and actions during the lesson
- Pupil talk and activity during the lesson
- Use of praise and reprimands by the teacher
- Number of pupils present in the classroom
- Teacher actions at the end of the lesson
- Availability of general resources
- Availability and use of TDP materials
- Multi-grade teaching
***4. PUPIL LEARNING ASSESSMENT AND BACKGROUND***
- Administered to the sample pupils who were in grade 3 at baseline and in grade 6 at endline (unless they repeated a grade)
- Learning assessment in English literacy, numeracy and scientific literacy
- Pupil gender, age, language at home, household characteristics
***5. CLASSROOM ATTENDANCE***
- Teacher and pupil attendance in all classrooms in the school when classes are meant to take place. The classroom attendance was checked twice during the day: 15 minutes after the morning roll call and 15 minutes after the end of the long break
***6. TEACHER ROSTER AND BACKGROUND***
- Compilation of a roster of all teachers who currently work at the school and teach classes for grades 1 to 6 pupils, excluding teachers who teach only Religious Studies or are on long leave (3+ months)
- For the teachers in the roster: year of birth, Nigeria Certificate in Education (NCE) qualification, gender, year when the teacher started teaching at the school, and in-service training received by the teacher
***7. TEACHER DEVELOPMENT NEEDS ASSESSMENT (TDNA)***
- Subject knowledge assessment in English, mathematics, and science (head teachers and teachers)
The evaluation used the same TDNA at baseline and endline. While this would possibly have given teachers an advantage in terms of familiarity with the tests, the advantage was expected to be similar in control and treatment schools, and so it was expected that it would not create bias in the test results. Using the same TDNA simplifies analysis by ensuring comparability between baseline and endline in terms of the skills being examined, but it does involve some risk in relation to leakage of test papers.
TDP's state government partners in the three states where this evaluation was conducted (Jigawa, Katsina, and Zamfara) conducted readiness sessions in preparation for the TDNA among teachers in TDP treatment schools who were part of the evaluation. These sessions used actual copies of the TDNA, and in Katsina the head teacher of each school was able to take away a copy of the test paper. The evaluation team were unaware of this preparation in advance. As the preparation was given only to teachers in treatment schools, it is likely to bias TDNA results in favour of the treatment group. After discovering teachers were using filled-in copies of the test paper in Katsina during fieldwork, the team took two steps to avoid a recurrence:
-Stricter test environments: Teachers were required to take the test simultaneously, with all data collectors present to closely monitor the process. Cell phones, books, and other materials were disallowed in the testing venue. Additionally, other teachers were not allowed to gain access to the testing venue.
-Test revision: In the third week of fieldwork, the mathematics section of the TDNA was revised and the new version was administered to the remaining sample schools.
At the pre-analysis stage, three steps were taken to examine whether there was a bias due to the TDNA preparation:
-The analysis looked at whether TDNA scores were higher among teachers in the treatment schools during the first week of data collection - before the stricter test environment was introduced - than during the rest of the data collection. No statistically significant effect was found.
-The analysis looked at whether teachers in Katsina had a particular advantage during the first week, given that Katsina head teachers were able to take away copies of the TDNA paper from the preparation sessions. No such effect was found.
-The analysis looked at whether teachers in treatment schools had significantly lower scores when they took the revised version of the mathematics test, introduced during the third week of fieldwork. The analysis did find such an effect. Teachers in treatment schools had significantly lower mathematics scores if they took the revised version of the test, while teachers in control schools did not. This suggests a positive effect of the preparation (and a bias in the TDNA results) of 4-6 percentage points for mathematics (depending on the regression model specification used to examine this).
Mathematics scores may therefore be biased upwards among treatment school teachers by 4-6 percentage points. There is no corresponding information for the English or science test sections of the TDNA. The potential effect of the TDNA preparation should be taken into account when using the TDNA data for the treatment group. The outcome of the impact evaluation endline analysis is that there was no significant change over time in TDNA scores in either treatment or control schools despite the TDNA preparation for the treatment group. It is possible that there would have been a worsening over time in treatment schools, and a negative effect of the TDP on TDNA scores, had it not been for the test preparation. However, given the data available, the evaluation team is only able to conclude that there was no improvement in TDNA scores and no positive impact.
Teacher subject knowledge
Subject knowledge assessment
Teacher Development Needs Assessment
The endline survey was carried out in three states in northern Nigeria: Jigawa, Katsina and Zamfara. The results are representative neither of any individual state nor of the three states as a whole. Rather, they are representative of the treatment and control clusters in the 14 local government authorities (LGAs) selected by the TDP in each of these three states.
The target populations are the schools eligible for the Teacher Development Programme in treatment and control groups in the three states Jigawa, Katsina and Zamfara, and the eligible teachers and pupils within these schools. See the 'Sampling' section for details on the eligibility of teachers and pupils.
Producers and sponsors
Oxford Policy Management Ltd
Education Data, Research and Evaluation in Nigeria
Department for International Development
The aim of the sampling design was to define a valid counterfactual (control group) from which comparisons could be made with the treatment group that participates in the Teacher Development Programme (TDP). The control group does not participate in the TDP in-service training but has background characteristics that are, on average, similar to those of the treatment group that does participate in TDP in-service training. The sampling design was based on a quasi-experimental 'constrained' randomisation approach. 'Constrained' is used because certain parameters of the impact evaluation were already fixed. For example, the Local Government Authorities (LGAs) where the TDP was to operate had already been selected by the TDP in agreement with the three states covered by the impact evaluation. In addition, pre-determined groups of schools fulfilling certain criteria (described below) constitute the sampling frame -- this is in contrast to a fully randomised design where one would expect the random drawing of groups (or clusters) of schools from a list of all state primary schools in the region under study. Randomisation was conducted only in allocating groups of schools to 'treatment' or 'control' status.
Sampling frame construction
The intended size of the sampling frame was 1,008 public primary schools eligible for the TDP (504 treatment and 504 control schools) in the three states. This would constitute the target population of eligible schools from which the sample of schools would be drawn for the survey. The sampling frame was constructed through the steps described below.
Step 1: LGAs
In each of the three states, 14 LGAs where the TDP would operate had been predetermined by the TDP in agreement with each state:
-Jigawa: 14 out of 27 LGAs.
-Katsina: 14 out of 34 LGAs.
-Zamfara: 14 out of 14 LGAs.
Step 2: Schools
To be eligible for the TDP, a school must have one head teacher, at least three other teachers, and at least eight grade 3 pupils. In each of the 14 LGAs in each state, two sets of 12 eligible primary schools were to be selected. Schools within each set were identified according to geographical proximity to facilitate any training and periodic meetings of teachers, and to create a peer network within the locality. The two sets of schools within each LGA were meant to be selected to be broadly similar. State Universal Basic Education Boards (SUBEBs) were responsible for the selection of the schools and were provided with guidelines for how to do this. For example, they were asked to take into account the location of schools (rural/urban), school size in terms of pupils enrolled and number of classrooms, condition of school infrastructure, and existence of a school-based management committee (SBMC).
Step 3: Teachers
Before the sampling of schools, the Local Government Education Authority (LGEA) and the head teacher of each school in the two sets (see step 2 above) were required to identify three teachers, in addition to the head teacher, who would potentially receive the TDP support and training. These teachers were identified using the following criteria:
-They teach classes at early grade level (grades 1 to 3); and
-They teach classes in any of the three subjects of English, mathematics, and science.
Step 4: Random assignment of treatment/control status
After receiving the lists of the sets of eligible schools and teachers from the TDP coordinators, the impact evaluation team randomly assigned one set of schools among every pair of sets in each LGA to TDP treatment, and the other set to control status. This resulted in 42 (14 LGAs x 3 states) sets consisting of 12 schools each, for a total of 504 schools to receive the TDP, and 42 (14 LGAs x 3 states) sets consisting of 12 schools each for a total of 504 schools that would not receive the TDP. The sample treatment and control schools were then selected from these two lists respectively. In the 504 schools that would receive the TDP, all head teachers and the teachers identified in step 3 (see above) would receive TDP support and training, while the head teachers and teachers in the schools on the list of 504 'non-TDP' schools would not.
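The pairwise assignment in step 4 can be illustrated with a short sketch. This is not the evaluation team's actual procedure or code: the LGA and set identifiers, the function name `assign_sets`, and the seed are hypothetical, and the sketch assumes a simple coin flip within each pair of sets.

```python
import random

def assign_sets(lga_pairs, seed=None):
    """Randomly assign one set in each LGA pair to treatment, the other to control.

    lga_pairs maps an LGA label to a tuple of two set identifiers (hypothetical labels).
    """
    rng = random.Random(seed)
    assignment = {}
    for lga, (set_a, set_b) in lga_pairs.items():
        treated = rng.choice((set_a, set_b))
        control = set_b if treated == set_a else set_a
        assignment[lga] = {"treatment": treated, "control": control}
    return assignment

# Hypothetical frame: 3 states x 14 LGAs, with two sets of 12 schools per LGA.
states = ["Jigawa", "Katsina", "Zamfara"]
lga_pairs = {
    f"{state}-LGA{i:02d}": (f"{state}-LGA{i:02d}-A", f"{state}-LGA{i:02d}-B")
    for state in states
    for i in range(1, 15)
}
assignment = assign_sets(lga_pairs, seed=2014)  # seed is arbitrary

SCHOOLS_PER_SET = 12
n_treatment_sets = len(assignment)
n_treatment_schools = n_treatment_sets * SCHOOLS_PER_SET
print(n_treatment_sets, n_treatment_schools)  # 42 504
```

Because exactly one set per LGA goes to each arm, the treatment and control lists are balanced by state and LGA by construction.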
Stage 1: Selection of schools
At the first stage, schools were selected using implicit stratification by state, LGA, and treatment/control status. That is, each of the school sets (see step 4 above), was considered a stratum. Four schools were randomly selected from each of these sets. This yielded an intended sample size of 56 (14 LGAs x 4 schools) treatment schools in each state and 56 (14 LGAs x 4 schools) control schools in each state, for an intended 168 treatment and 168 control schools across the three states, and a total intended sample size of 336 schools.
Stage 2a: Selection of teachers
At each selected school:
-The head teacher and the three teachers identified during the construction of the sampling frame (see step 3 above) constituted the sample to be interviewed, with a total intended sample size of 336 head teacher interviews and 1,008 (336 schools x 3 teachers) teacher interviews.
-The three selected teachers as well as the head teachers who teach any primary classes would also be observed while teaching (classroom observations), with a total intended sample size of up to 336 head teacher classroom observations and 1,008 (336 schools x 3 teachers) teacher classroom observations.
-All head teachers and selected teachers would also be administered the Teacher Development Needs Assessment (TDNA) in English, mathematics, and science, for a total intended sample size of 1,344 (336 schools x 4 teachers) head teachers and teachers.
Stage 2b: Selection of pupils
At each selected school, eight pupils who started grade 3 in September 2014 and who were being taught English, mathematics, or science by at least one of the selected teachers during that term, would be randomly selected for the pupil learning assessment. The pupils were drawn from a sampling frame consisting of all eligible grade 3 pupils present on the day of the survey. Eligible pupils were those in grade 3 who were being taught by at least one of the selected teachers. The intended pupil sample size was 2,688 (336 schools x 8 pupils).
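The intended sample sizes quoted in stages 1, 2a and 2b follow directly from the design parameters. The sketch below simply reproduces that arithmetic; the constant names are illustrative and not taken from the survey documentation.

```python
# Design parameters stated in the sampling description.
STATES = 3
LGAS_PER_STATE = 14
SCHOOLS_SAMPLED_PER_SET = 4   # four schools drawn from each set of 12
TEACHERS_PER_SCHOOL = 3       # excluding the head teacher
PUPILS_PER_SCHOOL = 8

schools_per_arm_per_state = LGAS_PER_STATE * SCHOOLS_SAMPLED_PER_SET
schools_per_arm = schools_per_arm_per_state * STATES
schools_total = schools_per_arm * 2  # treatment + control

head_teacher_interviews = schools_total
teacher_interviews = schools_total * TEACHERS_PER_SCHOOL
tdna_tests = schools_total * (TEACHERS_PER_SCHOOL + 1)  # teachers plus head teacher
pupil_assessments = schools_total * PUPILS_PER_SCHOOL

print(schools_per_arm_per_state, schools_per_arm, schools_total)  # 56 168 336
print(teacher_interviews, tdna_tests, pupil_assessments)          # 1008 1344 2688
```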
Survey longitudinal/panel design
The survey uses a panel design, that is, it aims to collect longitudinal data. This means the survey was implemented in the same sample of schools at baseline and endline, and that within these schools, it collected data from the same head teachers, teachers, and pupils over time. However, there was high head teacher, teacher, and pupil sample attrition between the baseline and endline surveys (see 'Sample attrition' section below).
1. School replacements
During the baseline fieldwork, five selected schools were found to be ineligible and one selected school could not be visited because of security concerns. These six schools were not replaced.
During the endline survey there were no school replacements because the survey uses a panel design.
2. Teacher replacements
There were three different cases of unavailability of selected teachers during the baseline fieldwork:
-A selected teacher was not present on the day of the survey due to short term absence, and data collectors attempted to re-visit the school at a later date;
-A selected school was found to be very small, with fewer than four eligible teachers (including the head teacher), in which case all teachers were interviewed when possible; and
-A selected teacher was on long leave, had been transferred, had passed away, or was unidentified, and data collectors would ask the head teacher to name a replacement teacher as per the teacher eligibility criteria (see step 3 above).
During the endline fieldwork there were no teacher replacements because the survey uses a panel design.
3. Pupil replacements
If one of the eight sampled pupils turned out to not be available, that pupil was randomly replaced with another pupil from the list of eligible pupils. If a selected school had eight or fewer grade 3 pupils present on the day of the baseline survey, all those eligible were selected for the pupil learning assessment.
During the endline fieldwork there were no pupil replacements because the survey uses a panel design.
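The baseline pupil selection and replacement rule can be sketched as follows. This is a minimal illustration, not the fieldwork software: the function names are hypothetical, and drawing replacements from a randomly ordered reserve list is one assumed way of implementing "randomly replaced".

```python
import random

def select_pupils(eligible, n=8, rng=None):
    """Return (selected, reserve): n randomly chosen pupils plus the rest in random order.

    If the school has n or fewer eligible pupils present, all of them are selected.
    """
    rng = rng or random.Random()
    if len(eligible) <= n:
        return list(eligible), []
    shuffled = list(eligible)
    rng.shuffle(shuffled)
    return shuffled[:n], shuffled[n:]

def replace_unavailable(selected, reserve, unavailable):
    """Swap each unavailable pupil for the next available pupil on the reserve list."""
    reserve = [p for p in reserve if p not in unavailable]
    final = []
    for pupil in selected:
        if pupil not in unavailable:
            final.append(pupil)
        elif reserve:
            final.append(reserve.pop(0))
    return final

# Hypothetical list of 20 eligible grade 3 pupils present on the survey day.
eligible = [f"pupil_{i:02d}" for i in range(20)]
selected, reserve = select_pupils(eligible, rng=random.Random(1))
final_sample = replace_unavailable(selected, reserve, unavailable={selected[0]})
```

At endline the same rule does not apply: the panel design means that only the pupils assessed at baseline are followed up.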
The actual sample sizes at baseline were:
-330 schools (intended 336);
-330 head teacher interviews (intended 336);
-2,575 grade 3 pupils (intended 2,688);
-908 teacher interviews (intended 1,008);
-1,070 classroom observations (intended up to 1,344 depending on how many head teachers teach any primary classes); and
-1,158 Teacher Development Needs Assessments (intended 1,344).
The actual sample sizes at endline were:
-330 schools (intended 330 in actual BL sample);
-329 non-panel (new head teachers were interviewed) and 134 panel head teacher interviews (intended 330 in actual BL sample);
-1,566 panel grade 6 pupils (intended 2,575 in actual BL sample);
-447 panel teacher interviews (intended 908 in actual BL sample);
-460 panel and 574 non-panel classroom observations (intended 1,070 in actual BL sample); and
-556 panel and 774 non-panel Teacher Development Needs Assessments (intended 1,158 in actual BL sample).
There was substantial head teacher, teacher, and pupil sample attrition between baseline and endline. The main reason for head teacher and teacher attrition was transfer to another school, while the main reason for pupil attrition was dropping out of school.
Sample attrition rates since baseline:
-Head teacher interviews 0.3% non-panel sample and 59% panel sample;
-Pupils 39% (in grade 3 at baseline and in grade 6 at endline);
-Teacher interviews 51%;
-Classroom observations 57%; and
-Teacher Development Needs Assessments 52%.
NOTE: Head teachers who were new at a sample school were also interviewed at endline. This yields a cross-sectional sample of 329 head teachers (some non-panel and some panel) and a panel sample of 134 head teachers (only those who were head teacher at the same school at both baseline and endline).
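The attrition rates listed above can be reproduced from the baseline and endline sample sizes as one minus the share of the baseline sample retained at endline. The sketch below uses only the counts reported in this section; the labels and the helper name `attrition_rate` are illustrative.

```python
def attrition_rate(n_baseline, n_endline):
    """Share of the baseline sample not observed at endline."""
    return 1 - n_endline / n_baseline

# (label, actual baseline sample, endline sample) as reported in this section.
samples = [
    ("Head teacher interviews, cross-sectional", 330, 329),
    ("Head teacher interviews, panel", 330, 134),
    ("Pupils", 2575, 1566),
    ("Teacher interviews", 908, 447),
    ("Classroom observations", 1070, 460),
    ("Teacher Development Needs Assessments", 1158, 556),
]

rates = {label: attrition_rate(b, e) for label, b, e in samples}
for label, rate in rates.items():
    print(f"{label}: {100 * rate:.1f}%")
```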
To reduce teacher and pupil attrition due to temporary absences on the day of the survey, two main steps were taken:
-School revisits: state coordinators and field teams revisited schools in order to conduct missing interviews for teachers and pupils who were unavailable on the original day of the visit; and
-Calling pupils from home: for pupils who lived close to the school but were absent on the day of the visit, data collectors worked with the head teachers and teachers to ask pupils to come to the school to take the pupil learning assessment if able to do so.
1. Overall pupil and teacher sample attrition
For the purposes of this impact evaluation, attrition for two units of analysis was of particular relevance: pupils and teachers. To understand attrition dynamics for those units of analysis in more detail, this evaluation assessed which baseline characteristics of pupils and teachers help explain whether individuals dropped out of the sample after baseline. This means that baseline data were used to compare the group of individuals for which data were also collected at endline (the non-attriters) to the group of individuals for which data were not collected at endline (the attriters). The purpose of this analysis is to understand whether estimates of characteristics of pupils or teachers at endline can generally be thought of as being produced on a sample that is comparable to the original group of individuals sampled at baseline. The attrition analysis and statistics are produced taking into account the full sampling structure of the data (weights, clustering, stratification). Test statistics are also corrected for multiple hypothesis testing, given that these tables compare many different indicators at the same time.
The baseline variables used to examine sample attrition are of three types:
-Pupil and teacher background characteristics, for example, gender;
-Pupil and teacher outcomes that the TDP seeks to influence, for example, pupil test scores and teacher subject knowledge scores; and
-School characteristics that are likely to be correlated with pupil and teacher behaviour and outcomes, for example, school size.
The results for pupils indicate that pupils who dropped out of the sample are mainly older and poorer than pupils who stayed in the sample. The results for teachers indicate that teachers who dropped out of the sample since baseline were significantly more likely to have a Nigeria Certificate in Education (NCE) qualification or higher, and performed significantly better on the Teacher Development Needs Assessment (TDNA), than teachers who remained in the sample. These results indicate that the evidence for selective attrition is weak among pupils but slightly stronger for teachers. This means that, taking overall attrition into account, the sample of pupils is still comparable to the original sample. Estimates generated using the teacher sample, however, need to be interpreted taking teacher attrition into account, and cannot necessarily be assumed to be representative of the target population of teachers at baseline.
2. Differential sample attrition for treatment and control pupils and teachers
This impact evaluation also examined whether there is differential attrition, which refers to situations where the background characteristics of individuals who drop out between survey rounds differ significantly between the treatment and control groups. This would mean that, after the sample attrition, the two groups are not comparable anymore and that the original assumption of the control group being an appropriate counterfactual to the treatment group for impact identification purposes is no longer correct. The analysis examining key characteristics and outcomes for non-attrited pupils and teachers at baseline across the TDP treatment and control groups indicates that differential attrition is generally not problematic. The estimates suggest that there are no significant differences between the control and treatment pupils who remained in the sample. For teachers there is also very limited indication of differential attrition.
The sample for this impact evaluation was selected to represent the eligible schools in the TDP clusters and corresponding control clusters for the three states. Therefore the sample is not designed to be representative at the state level and should not be treated as such. However, the survey does include schools with a wide range of characteristics, and statistics are likely to be broadly similar to those for the states as a whole.
See 'Sampling Procedure' section.
***NOTE: THIS SECTION SHOULD BE READ IN CONJUNCTION WITH SECTION 3 OF THE 'Teacher Development Programme - Endline Evaluation (Volume II)' AND THE TECHNICAL DOCUMENT 'Quantitative Analysis: Samples, Weights and Survey Settings' BOTH OF WHICH ARE PROVIDED UNDER RELATED MATERIALS***
In order to make inferences from the TDP baseline survey data, appropriate survey weights were calculated for each sample. The weights are equal to the inverse of the overall sampling probabilities taking into account each stage of selection. Given that the endline survey was conducted as a panel survey in 2017 and as a result all of the responding sample schools, pupils and teachers from the TDP Baseline Survey were included in the endline survey, the basic probabilities and weights are the same. However, it is necessary to adjust each set of baseline weights for the sample pupils, teachers and head teachers based on the attrition for the individual tests, interviews and observations in the endline survey. A comprehensive and detailed explanation of the construction of the baseline and endline survey weights can be found in Section 3 of 'Teacher Development Programme - Endline Evaluation (Volume II)' provided under Related Materials.
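As a rough illustration of the weighting logic, the sketch below computes a pupil weight as the inverse of the product of stage-wise selection probabilities and then scales it for attrition. It rests on assumptions that are not confirmed by this description: equal-probability selection of 4 schools from each set of 12, equal-probability selection of 8 pupils from those eligible, and a simple ratio adjustment for attrition within a cell. The function names are hypothetical; the actual construction is documented in Section 3 of Volume II.

```python
def design_weight(n_schools_in_set, n_schools_selected, n_pupils_eligible, n_pupils_selected):
    """Inverse of the overall selection probability across both sampling stages.

    Assumes equal-probability selection at each stage (an assumption of this sketch).
    """
    school_factor = n_schools_in_set / n_schools_selected
    # If all eligible pupils were taken, the pupil-stage probability is 1.
    pupil_factor = max(1.0, n_pupils_eligible / n_pupils_selected)
    return school_factor * pupil_factor

def attrition_adjusted(weight, n_baseline_respondents, n_endline_respondents):
    """Scale the baseline weight so endline respondents represent the full baseline cell.

    A simple ratio adjustment, assumed here for illustration only.
    """
    return weight * n_baseline_respondents / n_endline_respondents

# Hypothetical school: 40 eligible grade 3 pupils, 8 assessed at baseline, 5 found at endline.
w_baseline = design_weight(12, 4, 40, 8)
w_endline = attrition_adjusted(w_baseline, 8, 5)
```

In this hypothetical school each baseline pupil stands for 15 eligible pupils (3 x 5), and each pupil retained at endline stands for 24.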
Note that at endline, the sample of schools as well as the sample of pupils and teachers interviewed/observed/tested represents both the cross-sectional and panel samples. That is because all schools, pupils and teachers included in our sample at endline are also included in the baseline sample (the sample of schools is exactly the same at baseline and endline, while the sample of pupils and teachers at endline is a subset of that sample at baseline due to attrition). For head teachers, however, there are two different samples: 1) a cross-sectional sample that includes all head teachers who were interviewed/observed/tested at endline (this includes head teachers who were also included in the baseline sample as well as new head teachers who became head teachers of the school after the baseline survey) and 2) a panel sample that only includes those head teachers who were interviewed/observed/tested at both baseline and endline.
For the endline survey 12 sets of weights were calculated for the different samples. These are:
1. Endline school weight (for panel AND cross-sectional analysis of schools at endline; this is the same as the baseline school weight)
2. Endline head teacher interview weight (for cross-sectional analysis of all HTs at endline)
3. Panel head teacher interview weight (for panel analysis of HTs)
4. Panel head teacher motivation weight (for panel analysis of the motivation module for HTs)
5. Endline (panel) teacher interview weight (for panel AND cross-sectional analysis of teachers at endline)
6. Endline (panel) P6 pupil test weight (for panel AND cross-sectional analysis of pupils at endline)
7. Endline (panel) teacher observation weight - all lessons (for panel AND cross-sectional analysis of teacher observations at endline)
8. Endline (panel) teacher observation weight - excluding lessons that lasted < 9 minutes (for panel AND cross-sectional analysis of teacher observations at endline)
9. Panel head teacher observation weight - all lessons (for panel analysis of HT observations)
10. Panel head teacher observation weight - excluding lessons that lasted < 9 minutes (for panel analysis of HT observations)
11. Endline (panel) teacher tests (TDNA) weight (for panel AND cross-sectional analysis of teachers at endline)
12. Panel head teacher tests (TDNA) weight (for panel analysis of HTs)
The public use datasets include all these different sets of weights as well as the variables that account for stratification, clustered sampling and finite population corrections. Additionally, the datasets contain constructed variables that help identify the correct samples for each set of weights. For example, the panel head teacher interview weight must only be applied to the sample of panel head teachers. In the school level dataset, the dummy variable 'n_panel_int' identifies the sample of panel head teachers: it takes the value 1 if the head teacher was interviewed at baseline and endline (i.e. is in the panel sample) and 0 otherwise.
The technical document 'Quantitative Analysis: Samples, Weights and Survey Settings' provided under Related Materials lists the weight variable, sample identification variable and survey setting that must be used for each level/type of analysis.
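The pairing of a weight with its sample identification variable can be illustrated with a minimal sketch. It assumes a school-level file held as a list of records; 'n_panel_int' is the sample flag described above, while the weight name 'w_panel_ht' and the outcome 'ht_years_exp' are hypothetical placeholders, and the actual variable names must be taken from the technical document. The sketch computes a weighted point estimate only; standard errors additionally require the stratification, clustering, and finite population correction variables, which survey software should handle.

```python
# Illustrative school-level records. 'n_panel_int' is the sample flag named in
# the documentation; 'w_panel_ht' and 'ht_years_exp' are hypothetical names.
schools = [
    {"school_id": 1, "n_panel_int": 1, "w_panel_ht": 2.0, "ht_years_exp": 10.0},
    {"school_id": 2, "n_panel_int": 0, "w_panel_ht": 0.0, "ht_years_exp": 4.0},
    {"school_id": 3, "n_panel_int": 1, "w_panel_ht": 6.0, "ht_years_exp": 2.0},
]


def weighted_mean(rows, value, weight, sample_flag):
    """Weighted mean of `value`, restricted to rows where `sample_flag` equals 1."""
    sample = [r for r in rows if r[sample_flag] == 1]
    total_weight = sum(r[weight] for r in sample)
    if total_weight == 0:
        raise ValueError("no weighted observations in the selected sample")
    return sum(r[weight] * r[value] for r in sample) / total_weight


# Only the two panel head teachers (schools 1 and 3) enter the estimate.
panel_mean = weighted_mean(schools, "ht_years_exp", "w_panel_ht", "n_panel_int")
print(panel_mean)
```

Restricting to the flagged sample before applying the weight is the point of the sketch: applying the panel weight to all head teachers interviewed at endline would mix two different samples.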
Dates of Data Collection
Data Collection Mode
Computer Assisted Personal Interview [capi]
Quality control and data checking protocols
Several mechanisms were put in place in order to ensure high quality of the data collected during the endline survey.
Selection and supervision of data collectors
Each enumerator was supervised by the training team during the training, piloting, and first week of data collection. This allowed a well-informed selection of enumerators and their allocation to roles matching individual strengths and weaknesses.
CAPI built-in routing and validations
One important quality control mechanism in CAPI surveys is the use of automatic routing and checking rules built into the CAPI questionnaires that flag simple errors early enough for them to be corrected during the interview. In addition to having automatic skip patterns built into the design in order to eliminate errors resulting from wrong skips, the CAPI validations also checked for missing fields, out-of-range values, and inconsistencies within instruments. The inconsistency checks test whether related information collected in different questions of the instrument is consistent. A warning or error message was given if an entry was out of range, inconsistent, or left empty. The data collector would then try to understand why a warning or error message was showing up and reconfirm the information with the respondent.
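The three kinds of validation described above (missing fields, out-of-range values, and inconsistencies between related questions) can be sketched as follows. This is an illustration only and not the survey's CSPro logic; all field names, bounds, and rules are hypothetical.

```python
def validate_entry(record, required, ranges, consistency_rules):
    """Return a list of warning messages for one interview record."""
    messages = []
    # Missing fields: a required answer was left empty.
    for field in required:
        if record.get(field) in (None, ""):
            messages.append(f"{field}: missing")
    # Out-of-range values: a numeric answer falls outside the allowed bounds.
    for field, (low, high) in ranges.items():
        value = record.get(field)
        if value not in (None, "") and not (low <= value <= high):
            messages.append(f"{field}: out of range")
    # Inconsistencies: related answers contradict each other.
    for description, rule in consistency_rules:
        if not rule(record):
            messages.append(f"inconsistent: {description}")
    return messages


# Hypothetical fields, bounds, and rule, for illustration only.
required = ["teacher_age", "qualification"]
ranges = {"teacher_age": (18, 80)}
rules = [
    ("years at this school cannot exceed years teaching",
     lambda r: r["years_at_school"] <= r["years_teaching"]),
]

entry = {"teacher_age": 17, "qualification": "",
         "years_teaching": 5, "years_at_school": 8}
warnings = validate_entry(entry, required, ranges, rules)
for message in warnings:
    print(message)
```

In a CAPI instrument such messages appear while the respondent is still present, which is what allows the data collector to reconfirm the answer on the spot.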
Live interviews were observed by state coordinators, the field manager, and members of the fieldwork management team. Any errors detected during observations were noted and discussed with the teams at the daily de-brief.
Daily and weekly reporting from the field
At the end of each working day, supervisors collected all interview files from their team members and transmitted the data to the data manager. The supervisors also sent their daily achievements to a WhatsApp group that was created for the survey. These reports were checked for consistency, completeness, and correctness by the field management team and they were cross-checked with the data received by the data manager. Any missing or inaccurate data identified were communicated to the data collection team. Additionally, a Google tracking sheet was developed that was used by the teams at the end of each work day to fill in their achievements and comments for each school. The information provided in the Google sheet was cross-checked with the information provided on WhatsApp to ensure accuracy. Whenever there were discrepancies, the survey management team contacted the state coordinators to clarify. At the end of each working week, the state coordinators collated all achievements and challenges recorded by their teams over the course of the week and shared those with the field management team. This allowed the field management team to keep track of weekly achievements and to ensure that there were no missing data.
An Excel dashboard was created by the fieldwork management team to track the uploaded data. This information was cross-checked with the Google tracking sheets. The dashboard was also used to check any inconsistent or missing data. In the event of missing data, the field team was informed, and revisits were conducted to ensure data completeness.
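The cross-check between reported achievements and uploaded data can be sketched as a simple reconciliation. The sketch assumes that both sources have been reduced to a count of completed interviews per school; the school IDs and counts are hypothetical, and the actual dashboard was built in Excel.

```python
def reconcile(reported, received):
    """Compare per-school interview counts from field reports with uploaded data.

    Returns a dict of schools where the two sources disagree, including
    schools that appear in only one of the sources.
    """
    discrepancies = {}
    for school_id in sorted(set(reported) | set(received)):
        n_reported = reported.get(school_id, 0)
        n_received = received.get(school_id, 0)
        if n_reported != n_received:
            discrepancies[school_id] = {
                "reported": n_reported,
                "received": n_received,
            }
    return discrepancies


# Hypothetical counts: supervisors' daily reports versus files received.
reported = {"S001": 5, "S002": 4, "S003": 6}
received = {"S001": 5, "S002": 3}

to_follow_up = reconcile(reported, received)
for school_id, counts in to_follow_up.items():
    print(school_id, counts)
```

Schools listed in the result are the ones that would be referred back to the field team for clarification or a revisit.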
Secondary consistency checks and cleaning
The evaluation team exploited a key advantage of CAPI surveys, the immediate availability of data, by running a range of secondary consistency checks across all data on a daily basis in Stata. Data received from the field were exported to Stata the following day, and a range of do-files were run to assess the consistency and completeness of the data, and to make corrections if necessary. The checks comprised the following:
-Completeness and ID uniqueness: during this process, the data manager ensured that all the data reported in the daily field update were consistent with the data captured and sent in by the teams. Unique identification in each dataset and sound linkage between the datasets were also paramount and had to be checked on a daily basis.
-Consistency and out-of-range checks: a range of consistency and out-of-range checks that had not been included in the CAPI instruments were programmed into a checking Stata do-file. The data manager ran the checking do-file on a daily basis on the latest cleaned data. This returned a list of potential issues which the data manager would investigate, undertaking the necessary cleaning actions, if any. On a daily basis, all errors flagged were collated and shared with the survey management team in the field, as well as with the state coordinators and supervisors, so that the errors could be discussed with the data collectors. The purpose of flagging these errors was to monitor the performance of data collectors and provide them with feedback to help them improve.
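The ID uniqueness and linkage checks can be illustrated with a short sketch. The survey team ran these checks in Stata do-files; the Python below is a stand-in for illustration, and the dataset contents and variable names are hypothetical.

```python
from collections import Counter


def duplicate_ids(rows, key):
    """Return the IDs that occur more than once in a dataset."""
    counts = Counter(row[key] for row in rows)
    return sorted(id_ for id_, n in counts.items() if n > 1)


def unlinked_ids(child_rows, parent_rows, key):
    """Return IDs in the child dataset that have no match in the parent dataset."""
    parent_ids = {row[key] for row in parent_rows}
    return sorted({row[key] for row in child_rows} - parent_ids)


# Hypothetical school and teacher records, for illustration only.
school_rows = [{"school_id": 101}, {"school_id": 102}, {"school_id": 102}]
teacher_rows = [
    {"teacher_id": 1, "school_id": 101},
    {"teacher_id": 2, "school_id": 103},
]

print(duplicate_ids(school_rows, "school_id"))               # schools entered twice
print(unlinked_ids(teacher_rows, school_rows, "school_id"))  # teachers with no school record
```

Either list being non-empty would be raised with the field teams as part of the daily error report.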
Data Collection Notes
Oxford Policy Management's (OPM) Nigeria office conducted the endline survey of the TDP impact evaluation.
The fieldwork was led by the OPM Nigeria office, with support from OPM Oxford. The fieldwork management team comprised six members, including a project manager, fieldwork managers, data manager, and survey coordinators. The team also included several members with very strong computer programming skills in the software (CSPro) in which the questionnaires were administered. The overall project manager for the impact evaluation, who is responsible for the content of the questionnaires, worked closely with the fieldwork team during pre-testing, training, and piloting. A total of 61 trainees were invited to the training and, on completing it, were assigned to their respective roles as state coordinators, supervisors, and enumerators.
The early fieldwork preparation consisted of pre-testing and refining the questionnaires and protocols, developing the fieldwork manual, and training and piloting.
A full pre-test of all questionnaires and protocols took place from 18 September to 5 October 2017 in Kaduna State. Members of the OPM fieldwork management team, as well as six data collectors, who would later become the state coordinators during the fieldwork, conducted the pre-test. The first seven days were dedicated to training the data collectors, while in the remaining days 16 schools in eight LGAs were visited to administer and test all the questionnaires. The primary objectives of the pre-test were to test the changes to the baseline questionnaires that were made during the questionnaire development phase at endline, and to test the new pupil learning assessment questionnaire (at baseline, grade 3 pupils were assessed, while at endline those same pupils, who were now in grade 6, were assessed; as a result, a new learning assessment questionnaire was needed). The pre-test resulted in the refinement of the questionnaires and data collection protocols, as well as the improvement of the instrument programming in CAPI.
Using the baseline fieldwork manual as a basis, an extensive fieldwork manual was developed that covered an introduction to the TDP, a description of the fieldwork management and data collection teams, basic guidelines on behaviour and ethics, the use of computer assisted personal interviewing (CAPI), instructions on fieldwork plans and procedures, and a detailed description of all instruments and protocols. The manual was updated on an ongoing basis during the training and pilot phase, where updated conventions or additional clarifications were needed. The final version of the manual was printed at the end of the pilot phase and copies were provided to the field teams.
Training and pilot
Data collection training and a field pilot took place from 9 to 21 October 2017. In order to maximise training efficiency and minimise distractions to trainees, the training was conducted in-house at a hotel in Kaduna City, Nigeria. A total of 61 trainees participated in the training. The training was delivered by the fieldwork management team and members of the impact evaluation team. The main objective of the training was to ensure that data collectors would be able to master the questionnaires, understand and correctly implement the fieldwork protocols, and be able to comfortably use CAPI. Supervisors were furthermore trained on their extra responsibilities of data management, fieldwork and financial management, supervision of enumerators, and logistical tasks.
The training combined a variety of methods, including PowerPoint presentations, group sessions, mock interviews, role-play, and in-class scenarios to ensure that the training was comprehensive and interactive. The performance of trainees was assessed on an ongoing basis. Participants were quizzed at the beginning of each day to assess their level of understanding of the information they received the previous day, and to inform the training facilitators on areas where participants had knowledge gaps. Furthermore, participants were given daily evaluation forms in order to obtain their feedback on the day's training, with the aim of learning how facilitators could improve their delivery of the training.
Over the course of the training, two pilot surveys were conducted, which provided a full-team rehearsal. The trainees were closely observed by the training facilitators, who assessed their understanding of the instruments as well as their ability to interact with the respondents, code responses appropriately, and confidently use CAPI and the show cards that help respondents correctly identify the training they received.
At the end of the training and pilot phase, participants were assigned to their roles as supervisors and enumerators based on their language proficiency and their level of understanding of the survey instruments and their administration. Those who demonstrated desirable leadership and people management skills, in addition to mastery of the instruments and protocols, were appointed team supervisors.
More data collectors than were needed for data collection were invited to and attended the training. This allowed the best-suited candidates to be selected at the end of the training and provided a reserve pool of trained staff that could be called upon in case of enumerator attrition during data collection.
Data collection commenced on 24 October and ended on 17 November 2017. The field teams completed the survey in all 330 schools that were visited at baseline in Katsina, Jigawa, and Zamfara. All interviews and assessments were administered in Hausa, except for sections of the pupil learning assessment and Teacher Development Needs Assessment (TDNA) that assess English skills.
For the first two days of fieldwork, the fieldworkers were collapsed into three teams per state. The state coordinators and fieldwork management team worked closely with the teams to make sure that data collectors were confident and were coding accurately. When it was confirmed that they could work independently, the data collectors were split into six data collection teams per state, with each team composed of one supervisor and two enumerators. Each team completed a school visit in one day.
There were two state coordinators in each of Katsina and Zamfara states, while there were three state coordinators for Jigawa. The state coordinators provided leadership in each state to ensure successful and high-quality fieldwork implementation. State coordinators were responsible for devising implementation plans for their assigned states, and for managing state teams and other survey resources. In addition to these roles, they provided technical support to state teams, and supported the supervisors to perform their roles: for example, by working with their various teams to address data quality issues identified by the data management team on a day-to-day basis.
Additionally, members of the fieldwork management team were present in every state to provide administrative and technical support, supervision and mentoring, while the data management and IT team provided continuous back-end support to field teams.
Oxford Policy Management Nigeria
See 'Scope' section.
Given that the data were collected electronically, they were continually checked, edited and processed throughout the survey cycle.
A first stage of data checking was done by the survey team which involved (i) checking of all IDs; (ii) checking for missing observations; (iii) checking for missing item responses where none should be missing; and (iv) first round of checks for inadmissible/out of range and inconsistent values. See section 'Supervision' for more details. Additional data processing activities were performed at the end of data collection in order to transform the collected cleaned data into a format that is ready for analysis. The aim of these activities was to produce reliable, consistent and fully-documented datasets that can be analysed throughout the survey and archived at the end in such a way that they can be used by other data users well into the future. Data processing activities involved:
- Computing and merging in the sampling weights,
- Reshaping datasets in order to produce data files for each unit of observation (school, teacher interview, pupil, lesson, teacher roster, classroom),
- Anonymising data by removing all variables that identify respondents, such as names, addresses, GPS coordinates, etc.,
- Classifying non-response and coding them using a pre-determined classification scheme,
- Reviewing 'Other (specify)' responses by checking if any of the responses actually fall into existing response categories and can be recoded into the existing category or if there are multiple similar other responses that warrant the creation of a new response category (a decision to be made by the data analysts), and
- Properly naming and labelling the variables in each dataset.
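As an illustration of the anonymisation step in the list above, the following minimal sketch drops identifying variables from each record before release. The variable names are hypothetical, and the sketch is not the procedure actually used by the survey team.

```python
# Hypothetical names of variables that identify respondents.
IDENTIFYING_VARIABLES = {"respondent_name", "address", "gps_latitude", "gps_longitude"}


def anonymise(rows, identifying=IDENTIFYING_VARIABLES):
    """Return copies of the records with all identifying variables removed."""
    return [
        {name: value for name, value in row.items() if name not in identifying}
        for row in rows
    ]


# Hypothetical raw record, for illustration only.
raw_rows = [
    {"teacher_id": 1, "respondent_name": "A. Example", "address": "1 Example Road",
     "gps_latitude": 0.0, "gps_longitude": 0.0, "years_teaching": 7},
]

public_rows = anonymise(raw_rows)
print(public_rows)
```

The function returns new records and leaves the raw data unchanged, so the identified version can be kept securely while only the anonymised version is shared.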
The datasets were then sent to the analysis team where they were subjected to a second set of checking and cleaning activities. This included checking for out of range responses and inadmissible values not captured by the filters built into the CAPI software or the initial data checking process by the survey team.
A comprehensive data checking and analysis system was created, including a logical folder structure, a detailed data analysis guide, and template syntax files (in Stata). This ensured that data checking and cleaning activities were recorded, and that all analysts used the same file and variable naming conventions, variable definitions, and disaggregation variables, and applied weighted estimates appropriately.
Because computer assisted personal interviewing (CAPI) was used to collect the data, there was no data entry except for the Teacher Development Needs Assessment (TDNA), which was administered on paper. For this instrument, enumerators were trained to mark the TDNAs using the provided marking scheme and input the TDNA marks for each assessment item/question into an Excel file.
The datasets have been anonymised and are available as a Public Use Dataset. They are accessible to all for statistical and research purposes only, under the following terms and conditions:
1. The data and other materials will not be redistributed or sold to other individuals, institutions, or organisations without the written agreement of Oxford Policy Management Ltd.
2. The data will be used for statistical and scientific research purposes only. They will be used solely for reporting of aggregated information, and not for investigation of specific individuals or organisations.
3. No attempt will be made to re-identify respondents, and no use will be made of the identity of any person or establishment discovered inadvertently. Any such discovery would immediately be reported to Oxford Policy Management Ltd.
4. No attempt will be made to produce links among datasets provided by Oxford Policy Management Ltd, or among data from Oxford Policy Management Ltd and other datasets that could identify individuals or organisations.
5. Any books, articles, conference papers, theses, dissertations, reports, or other publications that employ data obtained from Oxford Policy Management Ltd will cite the source of data in accordance with the Citation Requirement provided with each dataset.
6. An electronic copy of all reports and publications based on the requested data will be sent to Oxford Policy Management Ltd.
The original collector of the data, Oxford Policy Management Ltd, and the relevant funding agencies bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
Oxford Policy Management. Nigeria Teacher Development Programme In-Service Training Impact Evaluation Endline Survey (TDPIE-EL) 2017, Version 2.1 of the public use dataset (August 2018). Ref. NGA_2017_TDPITCIE-EL_v01_M. Downloaded from [url] on [date]
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
(c) 2018, Oxford Policy Management Ltd
DDI Document ID
Oxford Policy Management Ltd
Pettersson Gelander, Gunilla
Quantitative Education Lead
Date of Metadata Production
DDI Document version
Version 1 (August 2018)
Version 1.1 - Identical to OPM's version 1 (Aug 2018) with IDs changed for publication in World Bank Microdata Catalog