Social Impact (SI) is conducting an impact evaluation of the MCC Tanzania Water Sector Project (WSP). The impact of the WSP will be assessed through a rigorous, quasi-experimental impact evaluation design that combines a difference-in-differences (DD) approach with generalized propensity score matching (GPSM), also called continuous propensity score matching. GPSM is an extension of traditional propensity score matching that facilitates evaluation of the impact of a continuous, rather than binary, treatment. The design reflects particular characteristics of the Tanzania WSP. First, the impacts of the upgraded water infrastructure are expected to be diffuse in each city; therefore, identifying a counterfactual through experimental methods is not feasible. Further, the main treatment is considered to be exposure to an increased supply of water due to the WSP infrastructure upgrades, and households will be affected differentially depending on their starting conditions (e.g. availability of water) and their position along the distribution grid. Thus, a continuous treatment approach is needed to measure the impacts of incremental increases in water supply. The GPSM technique (which will be carried out after the completion of end-line data collection) enables comparisons of outcomes between similar households that experience varying levels of improvements to water supply due to the intervention. The evaluation questions to be answered address a range of topics, including: the project's impact on water supply, access to water, and water quality; the project's impact on water consumption, water-related illness, and investment in human capital; differences in project impact by gender and socioeconomic status; the project's effect on businesses, schools, and health centers; project implementation; unintended consequences of the project; and the sustainability of the project over time.
In addition to the main analysis described above, additional qualitative, direct observation (e.g. water quality tests), secondary data review, and geospatial data collection components were incorporated to facilitate comprehensive, context-specific responses to these evaluation questions.
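To illustrate the continuous-treatment idea behind GPSM, the following is a simplified sketch of a generalized propensity score estimator in the style of Hirano and Imbens (normal model for the treatment given covariates, quadratic outcome model, dose-response curve by averaging predictions). It is not the evaluation team's actual estimation code, and all function and variable names are hypothetical.

```python
import numpy as np

def fit_treatment_model(treatment, covariates):
    # First stage: model the continuous treatment as normal given covariates.
    X = np.column_stack([np.ones(len(treatment)), covariates])
    beta, *_ = np.linalg.lstsq(X, treatment, rcond=None)
    resid = treatment - X @ beta
    sigma2 = resid.var(ddof=X.shape[1])
    return X @ beta, sigma2

def gps(t, mu, sigma2):
    # Generalized propensity score: conditional density of dose t given
    # each unit's covariates, under the normal treatment model.
    return np.exp(-(t - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

def dose_response(outcome, treatment, mu, sigma2, t_grid):
    # Second stage: quadratic outcome model in (T, R); the dose-response
    # curve averages predictions at each counterfactual dose over the sample.
    r_obs = gps(treatment, mu, sigma2)
    D = np.column_stack([np.ones_like(treatment), treatment, treatment ** 2,
                         r_obs, r_obs ** 2, treatment * r_obs])
    alpha, *_ = np.linalg.lstsq(D, outcome, rcond=None)
    curve = []
    for t in t_grid:
        r_t = gps(float(t), mu, sigma2)
        Dt = np.column_stack([np.ones_like(r_t), np.full_like(r_t, t),
                              np.full_like(r_t, float(t) ** 2),
                              r_t, r_t ** 2, float(t) * r_t])
        curve.append((Dt @ alpha).mean())
    return np.array(curve)
```

In the actual evaluation, the "dose" would be the change in water supply a household is exposed to, and the second stage would be combined with the DD framework described above.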
Kind of data
Sample survey data [ssd]
Anonymized dataset for public distribution
Propensity Score Matching
Urban municipalities of the cities of Dar es Salaam (Ilala, Kinondoni, and Temeke) and Morogoro (Morogoro Urban)
Unit of analysis
Main analysis: households and individuals. Some analyses using water quality or supply data are done at the cluster level (enumeration areas).
Qualitative analyses used data collected from community members, project stakeholders, enterprises, health centers, and schools.
The household and phone surveys were administered to one respondent per household, and collected information corresponding to the household as well as to each current household member (usual residents). The water quality tests were administered to up to two sources per cluster (either a household tap or another shared source in the cluster). The qualitative components included focus group discussions with residents across each city, semi-structured interviews with community-level water sector stakeholders, and key informant interviews with key project stakeholders.
Producers and sponsors
Social Impact, Inc.
Millennium Challenge Corporation
Households were sampled from both cities using a two-stage cluster sampling methodology, with stratification in Dar es Salaam by the current water supply to an area. Clusters were defined as census enumeration areas (EAs). The sample frame for clusters was an inventory of enumeration areas used for the 2012 census in Tanzania, obtained from the Tanzania National Bureau of Statistics (NBS). The required sample size was 5,008 households (8 households from each of 626 clusters), split evenly between the two cities. In Morogoro, 313 clusters were randomly sampled from the master inventory. In Dar es Salaam, where information about water supply was available by ward, clusters were chosen by stratified random sampling from 5 strata corresponding to different levels of current water access through the public distribution network. Selection of clusters was done using a random number generator in Stata 12 software. After selecting 313 clusters in each city, maps were obtained from the NBS. For each of the selected clusters, listing teams worked with local community representatives to enumerate all households in each EA and generate a complete sample frame of households. From each cluster's household list, 8 households per cluster (EA) were randomly selected for the household survey using a unique random number table for each cluster; additional households from the list could be drawn to replace households as needed due to non-response. After the households were interviewed, a subset of eligible households was selected for water quality testing (up to two per cluster). Following the household survey, the full household sample was included in three rounds of a follow-up survey administered by phone, by the EDI team.
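The two-stage selection described above (a random draw of clusters, then a random draw of households within each selected cluster) can be sketched as follows. This is a simplified Python illustration, not the Stata 12 routine actually used; the data structures and names are hypothetical.

```python
import random

def two_stage_sample(frame, clusters_per_stratum, hh_per_cluster, seed=None):
    """frame: {stratum: {cluster_id: [household_ids]}}.
    Returns {cluster_id: [sampled household_ids]}."""
    rng = random.Random(seed)
    sample = {}
    for stratum, clusters in frame.items():
        # Stage 1: simple random sample of clusters within the stratum.
        chosen = rng.sample(sorted(clusters), clusters_per_stratum[stratum])
        for cid in chosen:
            households = clusters[cid]
            # Stage 2: simple random sample of households within the cluster.
            k = min(hh_per_cluster, len(households))
            sample[cid] = rng.sample(households, k)
    return sample
```

In Morogoro the frame would contain a single stratum (the whole city); in Dar es Salaam, five strata defined by current network access, with the 313 clusters allocated across them.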
Deviations from sample design
If the listing team encountered any EAs in either city that had been demarcated strictly as an institution (e.g. hospital, school, jail) with only staff residing in the cluster, but had not been previously excluded from the sample frame, that EA was replaced by the next eligible EA from the list based on its random number, and the institutional cluster was excluded from the sample frame altogether. If community members or local officials declined to be involved in the surveying for any reason, that cluster was replaced. No deviations were made in the sampling procedures for the household survey. For the water quality testing, a much smaller sample of household taps was available for testing compared to initial expectations, so the eligibility for water quality tests was expanded at the beginning of these exercises to include shared sources in the community. Qualitative sampling was purposive and therefore was tailored to the specific objective of interviewing each type of respondent; while focus groups were initially planned to be a mix of males and females, after the first focus group the team decided to limit the participants to females only.
Response rates for the household survey in Dar es Salaam were above 87%, and above 92% in Morogoro. Overall, for the phone follow-up survey, response rates were 85% (round 1), 88% (round 2), and 90% (round 3); 81% of households overall participated in all 3 rounds while 90% participated in at least one round. Water quality test samples were drawn from household taps when available, or otherwise from other shared-source locations in the survey cluster (water quality results are intended to be representative at the cluster level). In Dar es Salaam, 95% of sampled clusters were covered by water quality tests, along with 99% of sampled clusters in Morogoro. Sampling for qualitative data collection components was purposive, in order to include specific types of respondents and target areas of each city with specific characteristics; this purposive sampling made extensive use of preliminary quantitative survey data and geospatial data. Qualitative research included, in total, 14 focus group discussions, 52 semi-structured interviews, and 10 key informant interviews.
Sample weights were applied to the baseline dataset to adjust for the cluster sampling design and, in Dar es Salaam, for stratification. The sampling weight indicates the number of households in the city overall that each household in the sample represents. Applying sampling weights allows the team to adjust for the sampling design and tabulate descriptive statistics that are representative of the entire city, as opposed to just the households included in the survey. Sample weights are equivalent to the inverse probability of being selected into the sample. The probability of a household's selection into the study is the product of two ratios. The first ratio is 8 divided by the number of households in the cluster. The second ratio is 313 divided by the total number of clusters in the inventory for the stratum (or, in the case of Morogoro, the number of clusters in the city, since no stratification was done). In other words, the probability of selection is the product of the probability of a cluster's selection and the probability of a household's selection within it. The sampling weight is the inverse of this probability of selection.
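The weight calculation described above amounts to a one-line formula. The sketch below uses the design parameters given in the text (313 sampled clusters, 8 households per cluster); the example household and cluster counts in the usage note are hypothetical.

```python
def sampling_weight(hh_in_cluster, clusters_in_frame,
                    clusters_sampled=313, hh_sampled=8):
    """Inverse of the two-stage selection probability:
    P(selection) = (clusters_sampled / clusters_in_frame)
                 * (hh_sampled / hh_in_cluster)."""
    p_cluster = clusters_sampled / clusters_in_frame
    p_household = hh_sampled / hh_in_cluster
    return 1.0 / (p_cluster * p_household)
```

For instance, a household in an 80-household cluster drawn from a hypothetical frame of 626 clusters would have selection probability (313/626) × (8/80) = 0.05, hence a weight of 20: it represents 20 households in the city.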
Dates of collection
Data collection supervision
Key field staff for data collection included a project manager, data manager/team leader, and field manager. In total, the field team was composed of two Field Coordinators, 12 listing supervisors, and 72 interviewers; field teams were split between the two cities. The Field Manager, who was responsible for overseeing all data collection teams in the field, was assisted by two Field Coordinators, who directly managed the survey and listing teams and collected the system-level data. This management team was based in Dar es Salaam and Morogoro for the entirety of data collection. During the data collection period, they oversaw the field teams (supervisors, enumerators, interviewers, laboratory staff) and coordinated all logistics, communicating any issues to higher management as needed. The Data Manager made several visits to the data collection sites throughout the preparation, piloting, and data collection implementation phases to oversee activities and address any outstanding issues. Listing teams worked closely with community representatives to develop the sample frame. Water quality staff included two laboratory supervisors and six laboratory technicians, split between the two cities. A data processing coordinator and quality control coordinator supported the survey teams. Qualitative interviews were conducted by two facilitators, experienced qualitative research consultants hired by EDI. All field staff, including supervisors and HQ staff, were required to complete training, which consisted of classroom instruction followed by a full-scale pilot of listing, sampling, and interviewing. During the survey, supervisors were responsible for reviewing the interviews conducted by their teams at the end of each day. This included reviewing each completed interview manually, as well as running the automated validation checks to ensure that all issues picked up by the Surveybe validation rules had been identified.
Any errors or inconsistencies detected at this stage were to be addressed with interviewers and reconciled before being uploaded into the database. EDI's policy was to conduct a full re-interview if the supervisors detected substantial discrepancies. In addition, supervisors were to complete direct observations of interviews several times per week and upload a weekly report summarizing the performance of the interviewers and any actions taken in response. Supervisors were also required to conduct re-visit spot-checks in three to six households per week to ensure data collection quality, using a standardized set of thirteen questions selected from the household questionnaire. This process was used to monitor interviewers and to provide constructive guidance as needed. Supervisors were also responsible for re-interviewing with the entire questionnaire if any interviewers were suspected of fraudulent behavior; no such re-interviews were necessary over the course of data collection.
Economic Development Initiatives, Ltd.
EDI employed a data processing and quality control team, which was tasked with ensuring the quality of data collected through the Surveybe system. Daily checks of questionnaire data were conducted in Stata using a continually updated checking do-file, which flagged discrepancies and data inconsistencies. Each supervisor was provided a set of data checks to address with each team on a continuous basis. Data processing and quality control staff were primarily based at EDI headquarters, but were present in the field for several weeks during the beginning stages of each phase of data collection. This presence allowed them to participate in feedback sessions with interviewers, demonstrating how data checks were conducted and how errors would be communicated to supervisors. The data processing team updated their checks periodically to accommodate new checks arising during the survey period, often in coordination with SI. SI's data quality monitoring strategy included continuous technical support to EDI over the entire period of data collection, field presence at all critical junctures during preparations for data collection, and several rounds of independent data verification with interim datasets provided throughout the survey period by EDI. SI wrote do-files (using Stata 12) to monitor the quality of the data as it was received, updating them continuously in consultation with EDI's data manager and data processing teams, and ran them against each of the datasets at numerous points between May and September 2013, communicating concerns or questions to EDI through a standard form used throughout the period of data collection. After the conclusion of the data collection, SI conducted a comprehensive data quality review of all datasets, inclusive of quantitative and qualitative datasets. SI submitted requests resulting from this review to EDI.
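The daily checking do-file described above follows a common pattern: a growing battery of range and consistency rules run against each day's interviews, with failures routed to supervisors as flags. A minimal Python sketch of that pattern (the specific rules, field names, and thresholds are hypothetical, not taken from the actual do-file):

```python
def run_checks(records, checks):
    """Apply each (check_id, predicate, message) rule to every record and
    collect flags for supervisors to reconcile with interviewers."""
    flags = []
    for rec in records:
        for check_id, predicate, message in checks:
            if not predicate(rec):
                flags.append({"hh_id": rec.get("hh_id"),
                              "check": check_id, "issue": message})
    return flags

# Hypothetical rules in the spirit of range and consistency checks.
CHECKS = [
    ("age_range", lambda r: 0 <= r.get("head_age", 0) <= 110,
     "household head age out of plausible range"),
    ("water_spend_nonneg", lambda r: r.get("water_spend", 0) >= 0,
     "negative weekly water expenditure"),
]
```

Keeping the rules as data (a list of predicates) mirrors how the do-file could be "continually updated" as new issues surfaced during the survey period.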
EDI responded to these requests and subsequently delivered final datasets to SI via MCA-T. EDI produced several briefs on data quality assurance during data collection, and SI produced a data quality report for internal review by MCC.
EDI used its in-house electronic data collection system, called Surveybe, to administer the household survey data collection. Validation checks programmed into Surveybe display red flags on the screen to alert the enumerator when responses need attention. In this data collection effort, a total of 79 validation rules were programmed within the full baseline household survey and 57 in the phone survey. Interviewers were required to run the automatic validation checks once per screen, and again at the end of the interview. The interview would not be considered complete and would not be uploaded to the database if validation flags had not been addressed by interviewers. Interviewers also had the ability to input comments providing explanation for seemingly inconsistent or outlier responses. All data from the Surveybe system were delivered to SI via MCA-T in Stata format. Final instruments were produced as PDF documents, exported from Surveybe, and included information on variable names, labels, response choices, validation rules, skip patterns, and the data table to which the data would export. Water quality test results were recorded on paper and subsequently transferred into a database uploaded through the Surveybe system. A photo of each water quality test slide reading, each with its own unique ID, was taken and provided to MCA-T and SI for QA purposes. System-level water quality tests were recorded directly into Microsoft Excel. Qualitative interviews were audio-recorded and transcribed directly into English in Microsoft Word; audio recordings were retained for QA purposes. Geospatial data collected by SI were transferred from hard copy to soft copy by SI's in-country GIS consultant, and the scanned hard-copy documents were retained for QA purposes.
Sampling errors were calculated for all estimated quantities of indicators when tabulated to be representative of the city's population (i.e. when estimates are provided after sampling weights have been applied). The team calculated errors using the -svy- suite of commands in Stata 12. This module uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The baseline report contains a full account of the indicators estimated and the associated sampling errors. The report also contains a comprehensive account of all efforts taken by the evaluation team to avoid non-sampling errors (in measurement, sample frame coverage, and others).
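For a weighted mean under cluster sampling, the Taylor linearization used by Stata's -svy- commands reduces to a between-cluster variance of linearized scores. The following single-stratum sketch illustrates the idea only; the actual estimation also accounts for strata and any finite population corrections, and the function name is hypothetical.

```python
import numpy as np

def svy_mean_se(y, weights, cluster_ids):
    """Weighted mean and its Taylor-linearized standard error under
    one-stage cluster sampling in a single stratum (no FPC)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    cids = np.asarray(cluster_ids)
    ybar = np.sum(w * y) / np.sum(w)
    # Linearized scores for the ratio estimator of the mean.
    z = w * (y - ybar) / np.sum(w)
    # Aggregate scores to cluster totals; the variance estimator is the
    # scaled between-cluster dispersion of these totals.
    clusters = np.unique(cids)
    totals = np.array([z[cids == c].sum() for c in clusters])
    n = len(clusters)
    var = n / (n - 1) * np.sum((totals - totals.mean()) ** 2)
    return ybar, float(np.sqrt(var))
```

With equal weights and one observation per cluster, this collapses to the familiar standard error of a simple random sample mean, which is a useful sanity check on the formula.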
Public use files
Rostapshova O, Roumis D, Alwang J on behalf of Social Impact, Inc. 2014. Baseline Data Collection for Impact Evaluation of the MCC Tanzania Water Sector Project, Baseline Data 2013. Dataset: [Name of Dataset]. Date Accessed: [Date accessed]. Accessed at: [URL where data was accessed]. Distributed by Millennium Challenge Corporation: Washington, DC.
Version 1.0 (Original 2014-04-10): Original version of metadata
Version 2.0: Edited by Jen Sturdy, April 11 2014
Version 3.0: Submitted by Social Impact, August 15 2014
Version 4.0: Submitted by Social Impact, December 11 2014.
Changes made in Version 4.0 include:
Minor changes to the following sections: Program; Program participants; Evaluation summary; Unit of analysis; Geographic coverage of program; Sampling procedure; Deviations from Sample Design; Response Rates; Weighting; Estimates of Sampling Error; Data Collection Notes; Data Collection Team Composition; Data Cleaning; Data Entry Process; Access Conditions; Citation Requirement.
Version 5.0 (May 2015): Edited version, by Millennium Challenge Corporation, based on Version 4.0 (DDI-MCC-TZA-WASH-IE-BL-SI-2014-v04.0).