Sampling
Sampling Procedure
Because the EQUIP-T regions and districts were purposively selected (see 'EQUIP-Tanzania Impact Evaluation. Final Baseline Technical Report, Volume I: Results and Discussion' under Reports and policy notes), the IE sampling strategy used propensity score matching (PSM) to: (i) match eligible control districts to the pre-selected and eligible EQUIP-T districts (see below), and (ii) match schools from the control districts to a sample of randomly selected treatment schools in the treatment districts. The same schools are surveyed in each round of the IE (a panel of schools), and a cross-section of Standard 3 pupils and Standard 1-3 teachers is interviewed in each round (no pupil panel or teacher panel).
------------------------------------------------
Identifying districts eligible for matching
------------------------------------------------
Eligible control and treatment districts were those not participating in any other education programme or project that may confound the measurement of EQUIP-T impact. To generate the list of eligible control and treatment districts, districts contaminated by other education programmes or projects, or potentially affected by programme spillover, were excluded as follows:
-All districts located in Lindi and Mara regions as these are part of the EQUIP-T programme but implementation started later in these two regions (the IE does not cover these two regions);
-Districts that will receive partial EQUIP-T programme treatment or will be subject to potential EQUIP-T programme spillovers;
-Districts that are receiving other education programmes/projects that aim to influence the same outcomes as the EQUIP-T programme and would confound measurement of EQUIP-T impact;
-Districts that were part of pre-test 1 (two districts); and
-Districts that were part of pre-test 2 (one district).
-------------------
Sampling frame
-------------------
To be able to select an appropriate sample of pupils and teachers within schools and districts, the sampling frame consisted of information at three levels:
-District;
-School; and
-Within school.
The sampling frame data at the district and school levels were compiled from the following sources: the 2002 and 2012 Tanzania Population Censuses, Education Management Information System (EMIS) data from the Ministry of Education and Vocational Training (MoEVT) and the Prime Minister's Office for Regional and Local Government (PMO-RALG), and the UWEZO 2011 student learning assessment survey. For within-school sampling, the frames were constructed upon arrival at the selected schools and were used to sample pupils and teachers on the day of the school visit.
-------------------
Sampling stages
-------------------
Stage 1: Selection of control districts
--------------------------------------------
Because the treatment districts were known, the first step was to find sufficiently similar control districts that could serve as the counterfactual. PSM was used to match eligible control districts to the pre-selected, eligible treatment districts using the following matching variables: population density, proportion of male-headed households, household size, number of children per household, proportion of households that speak an ethnic language at home, and district-level averages for household assets, infrastructure, education spending, parental education, school remoteness, pupil learning levels and pupil dropout.
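The matching idea can be illustrated with a minimal sketch. It assumes a logistic propensity model whose coefficients have already been estimated, and it pairs each treatment district with the control district whose score is closest (nearest-neighbour matching with replacement). The source does not state which matching algorithm was used, and the two covariates, the coefficients and the district labels below are hypothetical.

```python
import math


def propensity(covariates, coefs, intercept):
    """Propensity score: modelled probability of treatment given covariates."""
    z = intercept + sum(c * x for c, x in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-z))


def match_nearest(treated_scores, control_scores):
    """Pair each treated unit with the control unit closest in propensity score.

    Matching is with replacement, so one control can serve several treated
    units. Ties are broken by control identifier to keep the result stable.
    """
    return {
        t_id: min(control_scores,
                  key=lambda c_id: (abs(control_scores[c_id] - t_score), c_id))
        for t_id, t_score in treated_scores.items()
    }


# Hypothetical standardised covariates, e.g. (population density, household size).
COEFS, INTERCEPT = (1.0, 0.5), 0.0  # assumed values, not estimates from the IE
treated = {"T1": (0.4, 0.2), "T2": (-0.5, -0.2), "T3": (0.3, 0.2)}
controls = {"C1": (-0.7, -0.2), "C2": (0.35, 0.2), "C3": (1.5, 1.0)}

treated_scores = {k: propensity(v, COEFS, INTERCEPT) for k, v in treated.items()}
control_scores = {k: propensity(v, COEFS, INTERCEPT) for k, v in controls.items()}
matches = match_nearest(treated_scores, control_scores)
```

In this toy example T1 and T3 are both paired with C2, which shows how matching with replacement lets a small pool of control units serve a larger set of treatment units.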
Stage 2: Selection of treatment schools
-------------------------------------------------
In the second stage, schools in the treatment districts were selected using stratified systematic random sampling. The schools were selected using a probability proportional to size approach, where the measure of school size was the standard two enrolment of pupils. This means that schools with more pupils had a higher probability of being selected into the sample. To obtain a representative sample of programme treatment schools, the sample was implicitly stratified along four dimensions:
-District;
-PSLE scores for Kiswahili;
-PSLE scores for mathematics; and
-Total number of teachers per school.
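Systematic sampling with probability proportional to size and implicit stratification can be sketched as follows. This is an illustration under stated assumptions, not the evaluation team's actual procedure: the frame is sorted by the stratification variables, a random start is drawn within one sampling interval, and schools are selected wherever the running total of Standard 2 enrolment crosses a selection point. The frame, field names, sample size and seed are all hypothetical.

```python
import random


def pps_systematic(frame, n, size_key, sort_keys, rng):
    """Select n units systematically with probability proportional to size."""
    # Implicit stratification: sort the frame by the stratification variables
    # so that one systematic pass spreads the sample across all of them.
    ordered = sorted(frame, key=lambda unit: tuple(unit[k] for k in sort_keys))
    total = sum(unit[size_key] for unit in ordered)
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]

    selected, cumulative, next_target = [], 0.0, 0
    for unit in ordered:
        cumulative += unit[size_key]
        # A unit is selected once for every selection point inside its size
        # range, so a unit larger than the interval can be selected twice.
        while next_target < n and targets[next_target] < cumulative:
            selected.append(unit)
            next_target += 1
    return selected


# Hypothetical frame of 20 schools in two districts (made-up values).
frame = [
    {
        "id": i,
        "district": "A" if i < 10 else "B",
        "psle_kiswahili": 40 + (i * 3) % 25,
        "psle_maths": 30 + (i * 5) % 20,
        "teachers": 6 + i % 7,
        "std2_enrolment": 20 + (i * 7) % 40,
    }
    for i in range(20)
]

sample = pps_systematic(
    frame,
    n=5,
    size_key="std2_enrolment",
    sort_keys=["district", "psle_kiswahili", "psle_maths", "teachers"],
    rng=random.Random(2014),
)
```

Because every school in this toy frame is smaller than the sampling interval, the five selected schools are distinct; a school with larger enrolment covers more of the cumulative range and is therefore more likely to contain a selection point.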
Stage 3: Selection of control schools
----------------------------------------------
As in stage one, a non-random PSM approach was used to match eligible control schools to the sample of treatment schools. The matching variables were similar to the ones used as stratification criteria: Standard two enrolment, PSLE scores for Kiswahili and mathematics, and the total number of teachers per school.
The endline survey was conducted for the same schools as the baseline and midline surveys (a panel of schools). However, the IE does not have a panel of pupils or teachers as a pupil only attends standard three once (unless repeating) and there is high teacher turnover. Thus, the IE sample is a repeated cross-section of pupils and teachers in a panel of schools.
Stage 4: Selection of pupils and teachers within schools
-------------------------------------------------------------------
Pupils were sampled within schools using systematic random sampling based on school registers. The within-school sampling was assisted by selection tables automatically generated within the computer assisted survey instruments. Per school, 15 standard 3 pupils were sampled. The parents of these 15 sampled pupils were then interviewed using the poverty scorecard instrument.
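A minimal sketch of systematic random sampling from a school register is given below. It is not the logic of the selection tables built into the survey instruments, which the source does not describe. It assumes a fractional sampling interval with a random start, and it assumes that all pupils are taken when the register has 15 or fewer names. The register contents and the seed are hypothetical.

```python
import random


def systematic_sample(register, n, rng):
    """Draw n entries from an ordered register by systematic random sampling."""
    size = len(register)
    if size <= n:
        # Assumption: small registers are taken in full.
        return list(register)
    interval = size / n              # fractional sampling interval (> 1)
    start = rng.random() * interval  # random start in [0, interval)
    # Take the entry at each selection point; min() guards the upper bound
    # against floating-point rounding.
    return [register[min(int(start + j * interval), size - 1)] for j in range(n)]


# Hypothetical register of 47 Standard 3 pupils.
register = [f"pupil_{i}" for i in range(1, 48)]
pupils = systematic_sample(register, 15, random.Random(3))
small_school = systematic_sample(register[:10], 15, random.Random(3))
```

Since the interval exceeds one whenever the register is longer than the target, consecutive selection points always fall on different register entries, so no pupil is drawn twice.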
For the teacher interviews, as at midline, all teachers of Standards 1-3 who teach Kiswahili or maths were interviewed, to boost the sample size because many schools are small. At baseline, by contrast, up to three teachers were sampled within each school for the interviews.
Lesson observations were not randomly sampled. Instead, one maths lesson and one Kiswahili lesson in Standard 2 were selected within each school using convenience sampling, to be observed on the day of the survey.
-------------------------
Replacement sample
-------------------------
At baseline, a selected school that could not be surveyed was replaced. During sampling, the impact evaluation team drew a replacement sample of schools (a reserve list) for this purpose, and use of this list was carefully controlled. Five of the 200 original baseline sample schools were replaced during the fieldwork. At midline and endline, all of the 200 schools surveyed at baseline were visited again (no replacements).
---------------
Sample sizes
---------------
The actual sample sizes at endline are:
-200 schools (100 treatment and 100 control).
-2,999 standard 3 pupils assessed in both Kiswahili and mathematics.
-2,992 poverty scorecards administered to the assessed pupils' parent(s).
-889 teachers who teach Standards 1 to 3 Kiswahili and/or mathematics interviewed.
-196 Standard 2 Kiswahili and mathematics lessons observed (treatment schools only).
-99 teacher group interviews conducted (treatment schools only).
Note that the lesson observation and the small group teacher interview were only conducted in treatment schools, because the information generated could not be used in the impact modelling and so collecting information in control schools was not necessary.
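As a quick consistency check, the achieved sample sizes can be compared with the targets implied by the design described in the sampling stages: 15 pupils in each of the 200 schools, and two lesson observations in each of the 100 treatment schools. The target of one teacher group interview per treatment school is an assumption, as it is not stated explicitly. The variable names are illustrative.

```python
SCHOOLS = 200
TREATMENT_SCHOOLS = 100

# Targets implied by the design (group interview target is an assumption).
implied = {
    "pupils": SCHOOLS * 15,                 # 15 Standard 3 pupils per school
    "lessons": TREATMENT_SCHOOLS * 2,       # one maths and one Kiswahili lesson
    "group_interviews": TREATMENT_SCHOOLS,  # assumed one per treatment school
}

# Achieved endline sample sizes as reported in the list above.
achieved = {"pupils": 2999, "lessons": 196, "group_interviews": 99}

shortfall = {key: implied[key] - achieved[key] for key in implied}
```

Under these assumptions the shortfalls are 1 pupil, 4 lesson observations and 1 group interview.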
-------------------------
Representativeness
-------------------------
The results from the treatment schools are representative of government primary schools in the 17 EQUIP-T programme treatment districts. However, the results from the schools in the 8 control districts are NOT representative because these districts were not randomly sampled but matched to the 17 treatment districts using propensity score matching.