A Randomized Impact Evaluation of Early Childhood Development in Rural Mozambique 2014, Follow-up
Follow-up of a community-based ECD intervention carried out by Save the Children in 2008-2010 in rural Mozambique.
Starting in 2008, Save the Children implemented a center-based, community-driven preschool model in rural areas of Gaza Province in southern Mozambique. The project financed the construction, equipment and training for 67 classrooms in 30 communities to provide Early Childhood Development (ECD) activities for children aged 36 to 59 months. As part of its design, the program included an experimental impact evaluation (a cluster-randomized controlled trial) in which the 30 intervention communities were selected at random from a pool of 76 eligible sites. Before preschool activities began, a baseline survey was carried out in 2008 in the 76 communities across three districts of Gaza Province. Two years later, in 2010, the same 2,000 households participated in a midline survey to evaluate the impact of the program after one or two years of potential exposure to preschool. The present data correspond to the follow-up survey that took place in 2014, six years after the beginning of the intervention, when the targeted children were expected to be in primary school. The impact evaluation has four main research questions: (1) to evaluate the efficiency of a low-cost community-based preschool program in a disadvantaged rural African setting in terms of cognitive and socio-emotional skills as well as learning outcomes for the children; (2) to evaluate the effects of such an intervention on school enrollment, attendance, and progress (i.e. grade promotion, repetition, dropout); (3) to assess whether parenting practices and knowledge can be durably influenced by a community-based ECD program; (4) to identify potential spill-over effects of the program on health, education, productivity and labor market outcomes of siblings and parents of preschoolers. Field work was carried out from April to November 2014.
In addition to household surveys and cognitive assessments of children (in literacy, numeracy and non-verbal reasoning), data from primary school directors, preschool animators and community leaders were collected during this period. More than 90% of the original 2,000 target children of the 2008 survey were successfully tracked and geo-referenced.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
- Mothers/caregivers of the targeted children (aged 3 to 5 in 2008)
- Targeted children aged 3 to 5 in 2008 in sampled communities
- Siblings of the target children currently living in the same household
The scope of the follow-up data collection includes:
1) Socio-economic information:
- Caregiver roster
- General information on the household
- Child listing
- Current income and job situation
- Household characteristics
- Level of education
- Pre-school attendance
- Educational expenses
2) Child assessments
- Literacy test
- Numeracy test
- Non-verbal reasoning test
3) Child Time-use
4) Primary Schools:
- Director information
- Total school attendance, drop-out, repetition
- Basic infrastructure
5) Community Leader:
- Community general information
- Level of facilities, and location.
Three districts: Bilene, Manjacaze and Xai-Xai, located in Gaza Province (Southern Mozambique).
Producers and sponsors
IE Evaluation Team Member
IE Evaluation Team Member
Strategic Impact Evaluation Fund
Financing of follow-up survey
Community sampling process (baseline)
The design used for this impact evaluation is a cluster-randomized controlled trial (C-RCT) at the community level.
Stage 1: Community Eligibility.
Within the three target districts, a subset of eligible communities was identified that met two key operational requirements for implementation of the program:
1. Population size: To qualify for the intervention, communities must have a population no less than 500 and no more than 8000 people. This range was determined as operationally feasible given the community mobilization process that accompanies the establishment of each ECD center.
2. Clusters: Management of the intervention requires that the intervention be clustered in groups of 6 treatment communities that can be served by one program staff member. The definition of a cluster was set by Save the Children, based on minimum criteria of operational feasibility (distance or time traveled between sites).
The complete universe had 252 villages in three intervention districts. After applying eligibility criteria of population size and clustering, the sample was reduced to 167 villages in 11 clusters.
Stage 2: Cluster selection
The largest clusters in each district were selected for inclusion in the sample, resulting in a total of 98 villages. To achieve coverage in all three districts, it was further agreed with the NGO that the sample would include two clusters each in Manjacaze and Xai-Xai and one cluster in Bilene.
Stage 3: Community-level randomization
Within clusters of communities that met the two requirements outlined in Stage 1, communities were grouped into triplets based on population size, and from each triplet a treatment community was selected at random. The two smallest villages, which did not form part of a triplet, were dropped. The final sample is composed of 37 treatment villages (7 for replacement) and 59 control villages (11 for replacement), for a total sample of 96 villages. A total of 30 new intervention communities were then selected for this round of implementation through random assignment.
No replacement of communities was needed.
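The triplet-based assignment described in Stage 3 can be illustrated with a minimal Python sketch. It is not the evaluation team's actual procedure: the function name `assign_triplets`, the village names, the populations and the seed are hypothetical, and the sketch assumes that triplets are formed by ranking villages by population and that exactly one village per triplet is treated.

```python
import random

def assign_triplets(villages, seed=0):
    """Group villages into population-ranked triplets and treat one per triplet.

    villages: list of (name, population) tuples (hypothetical data).
    Returns (assignment dict, list of dropped village names).
    """
    rng = random.Random(seed)
    # Rank from largest to smallest so that any leftover villages are the smallest.
    ordered = sorted(villages, key=lambda v: v[1], reverse=True)
    n_full = len(ordered) - len(ordered) % 3
    assignment = {}
    for start in range(0, n_full, 3):
        triplet = ordered[start:start + 3]
        treated_idx = rng.randrange(3)  # one treatment village per triplet
        for idx, (name, _) in enumerate(triplet):
            assignment[name] = "treatment" if idx == treated_idx else "control"
    # Villages that do not complete a triplet are dropped (assumed rule).
    dropped = [name for name, _ in ordered[n_full:]]
    return assignment, dropped

# Hypothetical villages, all within the 500 to 8000 eligibility range.
villages = [("A", 7200), ("B", 5100), ("C", 4300), ("D", 3000),
            ("E", 2100), ("F", 1500), ("G", 900), ("H", 600)]
assignment, dropped = assign_triplets(villages, seed=42)
print(assignment)
print("dropped:", dropped)
```

Matching on population size before randomizing keeps treatment and control villages comparable in size within each triplet, which is the usual motivation for this kind of blocked design.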
Child-level selection:
In addition to randomization at the community level, there is exogenous variation in treatment within communities, based on rules of eligibility for Orphans and Vulnerable Children (OVC). ECD centers had a maximum of 3 classrooms with 35 students per class, for a maximum of 105 students per preschool. In cases of over-subscription to the ECD centers, Save the Children and the communities selected the children through a lottery system.
A total of 2,000 households with preschool-age children were sampled from the 76 evaluation communities at baseline. With no household listing available at the time of the survey, a census of each community was carried out to identify households with children aged 36 to 59 months. From the list of households with at least one child in this age range, the plan was to select 23 households per community at random. In addition, in 4 large treatment communities where oversubscription to the program was likely, an additional 63 households were selected, yielding a total sample of 2,000 households.
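As a quick arithmetic check on the planned sample size, the short Python sketch below reproduces the 2,000-household total. It assumes that the 63 additional households were drawn in each of the 4 large treatment communities; the text does not state this explicitly, but it is the reading under which the figures add up.

```python
communities = 76
planned_per_community = 23
large_treatment_communities = 4
extra_per_large_community = 63  # assumption: 63 extra households in EACH of the 4

base_sample = communities * planned_per_community                       # 1748
extra_sample = large_treatment_communities * extra_per_large_community  # 252
total_sample = base_sample + extra_sample

print(base_sample, extra_sample, total_sample)
```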
Deviations from the Sample Design
In practice, some communities did not have 23 eligible households. In those cases, all eligible households were sampled, while in larger communities more households than planned were sampled. Among the sampled households, 1,830 targeted children were assessed in literacy, numeracy and non-verbal reasoning.
The follow-up survey successfully tracked 1,875 households from baseline, representing 93.75% of the initial sample.
Two types of weights are included in SEQ_2014.dta. Both are defined at the community level and approximate the inverse probability of selection within the community.
Variable “Weight1” was generated as the number of eligible households (with at least one child aged between 36 and 59 months) in the community, measured during the baseline fieldwork, divided by the number of households sampled in the community at baseline.
Variable “Weight2” was generated as the size of the community as recorded in the 2007 national census divided by the number of households sampled in the community at baseline.
Weights may or may not be used in the analyses.
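The two weight definitions above amount to simple ratios. The following Python sketch restates them under stated assumptions: the function names and the example counts are hypothetical, and the unit of "community size" in the census (persons or households) is not specified in this documentation.

```python
def weight1(eligible_households, sampled_households):
    """Eligible households counted at baseline / households sampled at baseline."""
    if sampled_households <= 0:
        raise ValueError("sampled_households must be positive")
    return eligible_households / sampled_households

def weight2(census_size, sampled_households):
    """Community size in the 2007 census / households sampled at baseline."""
    if sampled_households <= 0:
        raise ValueError("sampled_households must be positive")
    return census_size / sampled_households

# Hypothetical community: 92 eligible households, census size 2300, 23 sampled.
w1 = weight1(92, 23)    # each sampled household stands for 4 eligible households
w2 = weight2(2300, 23)
print(w1, w2)
```

Because both weights are constant within a community, they rescale each community's contribution to pooled estimates rather than reweighting households within it.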
Dates of Data Collection
Assessment of literacy, numeracy and non-verbal reasoning
Community leader questionnaire
Data Collection Mode
Data collection was overseen by an Impact Evaluation research Field Coordinator from the World Bank, who conducted continuous spot-checks of both field work and data entry.
Data Collection Notes
Tracking of the sample was carried out between April and November 2014, in three phases.
Phase 1: Community Survey
a) Child and Household Survey: The household survey covers the panel of 2,000 households with children who were aged 36 to 59 months at the time of the baseline survey in 2008 and who were followed in the first follow-up survey in 2010. This cohort was approximately 96 to 120 months old at the time of this second follow-up survey. The survey also collected the geo-referenced coordinates of each target child’s current place of residence, using a GPS device supplied by the firm to each surveyor.
b) School Survey: A survey on the panel of approximately 76 schools in the original study areas was conducted to collect school-level data on the performance (i.e. grades) and school progression (i.e. repetition and drop-out rates) of children in Grades 1 to 5.
c) Community questionnaire: In the 46 control communities, a brief community survey was administered to assess whether any preschool activities had been implemented in the community since 2008. In the 30 treatment communities, additional questions were included to assess the extent to which the Save the Children program continued to exist after 2010.
Phase 2: Tracking of All Movers
d) Tracking Sample. At the end of Phase 1 (June 2014), surveys had been completed for 1,607 households with targeted children, and 383 target children had moved away or had not been identified in their original locality of residence. To minimize bias from sample attrition, all children who had moved from their last known place of residence were tracked by the survey firm to their new locality of residence and surveyed. Tracking followed the target child, even if some of his or her relatives were found in the original location. Children who had moved within the original community were not considered movers, and were located and interviewed as part of the standard field work operation under Phase 1a. Only children who had moved to a new locality were eligible for inclusion in the household tracking sample. If the target child had passed away between 2008 and 2014, the socio-economic questionnaire was still administered (with the consent of the caregivers). At the end of Phase 2, 210 target children had been successfully tracked.
Phase 3: Intensive Tracking Sample
In a final phase of tracking, all the targeted children not located in Phase 2 were intensively tracked with the objective of identifying the child’s current location and completing the survey. Given budget constraints, the tracking area included only Maputo, Maputo province, Gaza province, and the south of Inhambane province. Phase 3 tracked 1,889 households. However, 14 target children could not be matched with absolute certainty to the baseline, resulting in the use of 1,875 socio-economic questionnaires.
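The tracking figures can be reconciled with a few lines of arithmetic. The Python sketch below assumes that the 1,889 households reported for Phase 3 are the cumulative number located across all phases (an interpretation, not stated explicitly) and uses the 2,000 baseline households as the denominator.

```python
baseline_households = 2000
located_households = 1889   # assumption: cumulative count after Phase 3
unmatched_to_baseline = 14

usable_questionnaires = located_households - unmatched_to_baseline
tracking_rate = usable_questionnaires / baseline_households
attrition_rate = 1 - tracking_rate

print(f"{usable_questionnaires} usable questionnaires, "
      f"{tracking_rate:.2%} of baseline, {attrition_rate:.2%} attrition")
```

Under this reading the result matches the 1,875 households and 93.75% tracking rate reported for the follow-up survey.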
- Socio-economic questionnaire administered to the mother/caregiver from the current household of the targeted child
- Time-use of the targeted child
- Child assessment in Literacy, Numeracy and Non-verbal reasoning
- School director questionnaire
- Community leader questionnaire
- ECD Instructor questionnaire
Constructed or additional data sets
One variable, present in both the baseline and the endline datasets, was constructed by combining measurements. The variable is called score_real, and it was built by summing all the correct answers to the Provinha test for each student (each correct answer is worth 1 point). It can be compared with the variable score_admin, which contains the test results as computed by the administrators of the test at baseline, and by IFP students at endline.
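The construction of score_real can be sketched in Python as a count of correct answers. The answer key, the item responses and the function name below are hypothetical; the actual item-level variables in the datasets are not described here.

```python
def score_real(answers, answer_key):
    """Count correct answers; each correct answer is worth 1 point."""
    if len(answers) != len(answer_key):
        raise ValueError("answers and answer_key must have the same length")
    return sum(1 for given, correct in zip(answers, answer_key) if given == correct)

# Hypothetical four-item test for one student.
answer_key = ["a", "b", "d", "d"]
student_answers = ["a", "b", "c", "d"]
recomputed = score_real(student_answers, answer_key)

# A hypothetical administrator-computed score, for comparison with score_admin.
score_admin = 3
print(recomputed, recomputed == score_admin)
```

Comparing the recomputed score with score_admin student by student is one way to flag scoring discrepancies.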
Use of the dataset must be acknowledged using a citation that includes:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Marie Hélène Cloutier
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.