A Randomized Impact Evaluation of Early Childhood Development in Rural Mozambique 2014
Starting in 2008, Save the Children implemented a center-based, community-driven preschool model in rural areas of Gaza Province in Southern Mozambique. The project financed the construction, equipment and training for 67 classrooms in 30 communities to provide Early Childhood Development (ECD) activities for children aged 36 to 59 months. As part of its design, the program included an experimental impact evaluation (a cluster-randomized controlled trial) in which the 30 intervention communities were selected at random from a pool of 76 eligible sites. Before the preschool activities began, a baseline survey was carried out in 2008 in 76 communities across three districts of Gaza Province. Two years later, in 2010, the same 2,000 households participated in a midline survey to evaluate the impact of the program after one or two years of potential exposure to preschool. The present data correspond to the follow-up survey that took place in 2014, six years after the beginning of the intervention, when the targeted children were expected to be in primary school. The impact evaluation has four main research questions: (1) to evaluate the efficiency of a low-cost community-based preschool program in a disadvantaged rural African setting in terms of cognitive and socio-emotional skills as well as learning outcomes for the children; (2) to evaluate the effects of such an intervention on school enrollment, attendance and progress (i.e. grade promotion, repetition, dropout); (3) to assess whether parenting practices and knowledge can be durably influenced by a community-based ECD program; (4) to identify potential spill-over effects of the program on health, education, productivity and labor market outcomes of siblings and parents of preschoolers. Field work was carried out from April to November 2014.
In addition to household surveys and cognitive assessments of children (in literacy, numeracy and non-verbal reasoning), data were collected during this period from primary school directors, preschool animators and community leaders. More than 90% of the original 2,000 target children of the 2008 survey were successfully tracked and geo-referenced.
Kind of data
Sample survey data [ssd]
Three districts: Bilene, Manjacaze and Xai-Xai, located in Gaza Province (Southern Mozambique).
Unit of analysis
- Mothers/caregivers of the targeted children (aged 3 to 5 in 2008)
- Targeted children aged 3 to 5 in 2008 in sampled communities
- Siblings of the target children currently living in the same household
Producers and sponsors
IE Evaluation Team Member
IE Evaluation Team Member
Strategic Impact Evaluation Fund
Financing of follow-up survey
Communities sampling-process (baseline)
The design used for this impact evaluation is a cluster-randomized controlled trial (C-RCT) at the community level.
Stage 1: Community Eligibility.
Within the three target districts, a subset of eligible communities is identified that meets two key operational requirements for implementation of the program:
1. Population size: To qualify for the intervention, communities must have a population no less than 500 and no more than 8000 people. This range was determined as operationally feasible given the community mobilization process that accompanies the establishment of each ECD center.
2. Clusters: Management of the intervention requires that treatment communities be clustered in groups of 6 that can be served by one program staff member. The definition of a cluster was set by Save the Children, based on minimum criteria of operational feasibility (distance or time traveled between sites).
The complete universe had 252 villages in three intervention districts. After applying eligibility criteria of population size and clustering, the sample was reduced to 167 villages in 11 clusters.
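The population-size criterion from Stage 1 can be sketched as a simple filter. This is a minimal illustration only: the village records below are hypothetical, and the clustering criterion (which depends on distances set by Save the Children) is not modeled.

```python
# Hypothetical village records; names and populations are illustrative only.
villages = [
    {"name": "A", "population": 450},
    {"name": "B", "population": 500},
    {"name": "C", "population": 3200},
    {"name": "D", "population": 8000},
    {"name": "E", "population": 8001},
]

MIN_POPULATION = 500   # "no less than 500" in the eligibility rule
MAX_POPULATION = 8000  # "no more than 8000" in the eligibility rule


def is_population_eligible(population):
    """Return True when the population lies in the inclusive range 500..8000."""
    return MIN_POPULATION <= population <= MAX_POPULATION


# Keep only the villages that pass the population-size criterion.
eligible = [v["name"] for v in villages if is_population_eligible(v["population"])]
print(eligible)
```

Both bounds are treated as inclusive, following the wording "no less than" and "no more than".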
Stage 2: Cluster selection
The largest clusters in each district were selected for inclusion in the sample, resulting in a total of 98 villages. To achieve coverage in all three districts, it was further agreed with the NGO that the sample would include two clusters each in Manjacaze and Xai-Xai and one cluster in Bilene.
Stage 3: Community level randomization
Within clusters of communities that meet the two requirements outlined in Stage 1, communities were grouped into triplets based on population size, and from each triplet one treatment community was selected at random. The two smallest villages, which did not form part of a triplet, were dropped. The final sample is composed of 37 treatment villages (7 for replacement) and 59 control villages (11 for replacement), for a total sample of 96 villages. A total of 30 new intervention communities were then selected for this round of implementation through random assignment.
No replacement of communities was needed.
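The triplet-based assignment described in Stage 3 could look roughly like the sketch below. It rests on assumptions that the source does not confirm: triplets are formed from consecutive communities after sorting by population, leftover communities at the small end are dropped, and the names, populations and seed are hypothetical.

```python
import random


def assign_treatment(communities, seed=0):
    """Assign one treatment community per population-matched triplet.

    communities: list of (name, population) tuples within one cluster.
    Returns a dict mapping name -> "treatment" or "control".
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    # Sort from largest to smallest so any leftovers are the smallest villages.
    ordered = sorted(communities, key=lambda c: c[1], reverse=True)
    usable = len(ordered) - len(ordered) % 3  # drop villages outside a full triplet
    assignment = {}
    for start in range(0, usable, 3):
        triplet = ordered[start:start + 3]
        treated_name = rng.choice(triplet)[0]  # one random draw per triplet
        for name, _population in triplet:
            assignment[name] = "treatment" if name == treated_name else "control"
    return assignment


# Hypothetical cluster of eight communities (illustrative values only).
example_cluster = [
    ("V1", 7000), ("V2", 6500), ("V3", 6000), ("V4", 3000),
    ("V5", 2800), ("V6", 2500), ("V7", 700), ("V8", 600),
]
result = assign_treatment(example_cluster, seed=1)
print(result)
```

With eight communities, two complete triplets are formed and the two smallest communities receive no assignment, mirroring the dropped villages mentioned above.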
Child-level selection:
In addition to randomization at the community level, there is exogenous variation in treatment within communities, based on rules of eligibility for Orphans and Vulnerable Children (OVC). ECD centers had a maximum of 3 classrooms with 35 students per class, for a maximum of 105 students per preschool. Where ECD centers were over-subscribed, Save the Children and the communities selected the children through a lottery system.
A total of 2,000 households with preschool-age children were sampled from the 76 evaluation communities at baseline. With no household listing available at the time of the survey, a census of each community was carried out to identify households with children aged 36 to 59 months. From the list of households with at least one child in this age range, 23 households per community were to be selected at random. In addition, in each of 4 large treatment communities where oversubscription to the program was likely, an additional 63 households were selected, yielding a total sample of 2,000 households.
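The planned sample size can be checked with a short calculation. The sketch assumes that the additional 63 households apply to each of the 4 large treatment communities, which is the reading under which the stated total of 2,000 is reproduced.

```python
communities = 76                 # evaluation communities at baseline
planned_per_community = 23       # households planned per community
large_treatment_communities = 4  # communities where oversubscription was likely
extra_per_large_community = 63   # assumption: 63 extra households in EACH of the 4

base_sample = communities * planned_per_community
extra_sample = large_treatment_communities * extra_per_large_community
total_sample = base_sample + extra_sample

print(base_sample, extra_sample, total_sample)  # 1748 252 2000
```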
Deviations from sample design
In practice, some communities did not have 23 eligible households. In these cases, all eligible households were sampled, while in larger communities more households than planned were sampled.
The follow-up survey successfully tracked 1,875 households from baseline, representing 93.75% of the initial sample. Among them, 1,830 targeted children were assessed in literacy, numeracy and non-verbal reasoning.
Two types of weights are included in SEQ_2014.dta. Both are defined at the community level and estimate the inverse probability of selection within the community.
Variable “Weight1” is the number of eligible households in the community (those with at least one child aged between 36 and 59 months, as measured during the baseline fieldwork) divided by the number of households sampled in the community at baseline.
Variable “Weight2” is the size of the community as recorded in the 2007 national census divided by the number of households sampled in the community at baseline.
Weights may or may not be used in the analyses.
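As an illustration of how the two weights are defined, the sketch below applies the ratios described above to one hypothetical community. The dictionary keys and the numbers are assumptions made for the example; they are not variable names or values from SEQ_2014.dta.

```python
def inverse_probability_weight(population_count, sampled_households):
    """Return population_count divided by the number of sampled households."""
    if sampled_households <= 0:
        raise ValueError("sampled_households must be positive")
    return population_count / sampled_households


# Hypothetical community (illustrative values only).
community = {
    "eligible_households": 92,   # households with a child aged 36 to 59 months
    "census_size_2007": 1150,    # community size from the 2007 national census
    "sampled_households": 23,    # households sampled at baseline
}

# Weight1: eligible households per sampled household.
weight1 = inverse_probability_weight(
    community["eligible_households"], community["sampled_households"]
)
# Weight2: census community size per sampled household.
weight2 = inverse_probability_weight(
    community["census_size_2007"], community["sampled_households"]
)
print(weight1, weight2)  # 4.0 50.0
```

In this example each sampled household stands for 4 eligible households under Weight1, while Weight2 scales to the full census size of the community.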
Dates of collection
Assessment of literacy, numeracy and non-verbal reasoning
Community leader questionnaire
Mode of data collection
Data collection supervision
Data collection was overseen by an Impact Evaluation research Field Coordinator from the World Bank, who conducted continuous spot-checks in the field and of data entry.
- Socio-economic questionnaire administered to the mother/caregiver in the current household of the targeted child
- Time-use of the targeted child
- Child assessment in Literacy, Numeracy and Non-verbal reasoning
- School director questionnaire
- Community leader questionnaire
- ECD Instructor questionnaire
Constructed or additional data sets
One variable, present in both the baseline and the endline data sets, was constructed by combining measurements. The variable is called score_real, and it was built by summing all the right answers to the Provinha test for each student (each right answer is worth 1 point). It can be compared to the variable score_admin, which contains the test results as computed by the administrators of the test at baseline, and by IFP students at endline.
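The construction of score_real can be sketched as follows. The item-level layout (one 0/1 entry per test item) and the sample values are assumptions made for illustration; only the names score_real and score_admin come from the description above.

```python
def compute_score_real(item_results):
    """Sum the right answers, counting 1 point for each correct item."""
    return sum(1 for answer_is_right in item_results if answer_is_right)


# Hypothetical student record: 1 = right answer, 0 = wrong answer.
student = {
    "items": [1, 0, 1, 1, 0, 1],
    "score_admin": 4,  # score as recorded by the test administrators
}

score_real = compute_score_real(student["items"])
# A nonzero difference would flag a discrepancy between the two scores.
difference = score_real - student["score_admin"]
print(score_real, difference)  # 4 0
```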
Marie Hélène Cloutier
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.