Service Delivery Indicators Kenya Education Survey 2012 - Harmonized Public Use Data
This survey is part of the Service Delivery Indicators (SDI) project, an initiative led by the World Bank. SDI surveys track the quality of service delivery in primary schools and frontline health facilities across Africa and other regions of the world. The indicators can be used to track progress within and across countries over time, and they aim to strengthen active monitoring of service delivery in order to increase public accountability and good governance. Ultimately, the goal of the program is to support policymakers, citizens, service providers, donors, and other stakeholders in enhancing the quality of service delivery and improving development outcomes.
Since the inception of the initiative in 2010, twenty-six surveys have been completed in twelve countries in Africa, capturing the health and primary education service delivery experience of over 500 million people. The surveys have been extended in Africa (with SDIs in the pipeline or envisioned in Benin, DRC, Cameroon, Madagascar, Ethiopia, Côte d'Ivoire, Gambia, Guinea-Bissau, Malawi, and Mali) and beyond: data collection for the first SDI surveys in Latin America (Guatemala), East Asia/Pacific (Indonesia), Europe and Central Asia (Armenia, Moldova, and Ukraine), Middle East and North Africa (Iraq), and South Asia (Bhutan) is underway or being considered.
With the exception of Module 3 (school finances), the data distributed here were harmonized to a common standard to facilitate comparisons across countries and over time. The data were also anonymized to protect the confidentiality of respondents. The harmonization and anonymization work was conducted by the SDI team at the World Bank.
SDI surveys are documented in the Microdata Library as Service Delivery Indicators Health Surveys and Service Delivery Indicators Education Surveys.
The Service Delivery Indicators (SDI) are a set of health and education indicators that examine the effort and ability of staff and the availability of key inputs and resources that contribute to a functioning school or health facility. The indicators are standardized, allowing comparison between and within countries over time.
The Education SDIs include teacher effort, teacher knowledge and ability, and the availability of key inputs (for example, textbooks, basic teaching equipment, and infrastructure such as blackboards and toilets). The indicators provide a snapshot of the learning environment and the key resources necessary for students to learn.
Kenya's Service Delivery Indicators Education Survey was implemented in May-July 2012 by the Economic Policy Research Center and Kimetrica, in close coordination with the World Bank SDI team. The data were collected from a stratified random sample of 239 public and 67 private schools to provide a representative snapshot of the learning environment in both public and private schools. The survey assessed the knowledge of 1,679 primary school teachers, surveyed 2,960 teachers for an absenteeism study, and observed 306 grade 4 lessons. In addition, learning outcomes were measured for almost 3,000 grade 4 students.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Schools, teachers, students.
v02: harmonized and anonymized data for public distribution.
Module 3 (School Management/Finances) was added to the collection. Note that Module 3's variables were not harmonized to a common standard. Because educational and administrative structures differ across SDI countries, the Module 3 questions are designed for each country's interests and context. Nonetheless, the dataset underwent the same thorough anonymization process as the other datasets in the collection.
Compared to v01, v02 changes include:
1. Data reorganized by the unit of analysis (school, teacher, pupil).
2. New round of data cleaning (renaming and relabeling variables, fixing incorrect value labels, fixing and recategorizing missing observations, deleting analysis/supervision variables, reordering variables to reflect instrument structure, fixing incorrect category values).
3. Addition of anonymized variables from Module 3. Module 3 was not included in the previous version because the school financial information collected is specific to each country's context and cannot be harmonized.
4. New round of data harmonization (with the exception of module 3) and anonymization.
For more details on the anonymization process, please refer to the Anonymization Protocol included as part of this collection’s documentation.
The core Education Service Delivery Indicators are:
1) Teacher Effort:
- School absence rate
- Classroom absence rate
- Time spent teaching per day
2) Teacher Knowledge and Ability:
- Minimum knowledge of mathematics
- Minimum knowledge of English
- Minimum knowledge of pedagogy
3) Availability of Inputs:
- Minimum infrastructure availability
- Minimum equipment availability
- Share of pupils with textbooks
- Observed pupil-teacher ratio
The survey also tests grade 4 pupils on mathematics, language, and non-verbal reasoning. Additionally, enumerators make direct observations to assess classroom conditions firsthand, as well as the school inputs and infrastructure readily available to teachers and students.
SDI instruments also include modules on the school's finances and staff/teacher rosters that include their qualifications and demographic information.
All primary schools
Producers and sponsors
The World Bank
Economic Policy Research Center
William and Flora Hewlett Foundation
The World Bank
The sampling strategy for SDI surveys is designed to produce indicators that are accurate and representative at the national level, as this allows for proper cross-country comparisons (i.e. international benchmarking) and, when applicable, comparisons over time. In addition, other levels of representativeness are sought to allow for further disaggregation (rural/urban areas, public/private facilities, subregions, etc.) during the analysis stage.
The sampling strategy for SDI surveys follows a multistage approach. The main units of analysis are facilities (schools and health centers) and providers (health and education workers: teachers, doctors, nurses, facility managers, etc.). In the case of education, SDI surveys also aim to produce accurate information on grade four pupils' performance through a student assessment. Multistage sampling makes the procedure more practical by breaking the selection from a large population of sampling units into successive steps. After the sampling frame is defined and divided into strata, a first-stage selection of sampling units is carried out independently within each stratum. Often, the primary sampling units (PSUs) at this stage are cluster locations (e.g. districts, communities, counties, or neighborhoods), which are randomly drawn within each stratum with probability proportional to size (PPS), where size is measured by the location's number of facilities, providers, or pupils. Once locations are selected, the second stage randomly selects facilities within each location (either with equal probability or with PPS) as secondary sampling units. At the third stage, a fixed number of health and education workers and pupils are randomly selected within facilities to provide information for the different questionnaire modules.
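The first two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the Kenya survey: the sampling frame, the cluster names, and the helper functions `pps_systematic` and `multistage_sample` are hypothetical, strata are ignored, and systematic selection is just one common way to implement PPS.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(sizes, k, rng):
    """Return indices of k units drawn by systematic PPS sampling.

    A unit larger than the sampling interval can be hit more than once;
    real designs usually treat such units as certainty selections.
    """
    total = sum(sizes)
    step = total / k                     # sampling interval
    start = rng.uniform(0, step)         # random start within the first interval
    cumulative = list(accumulate(sizes))
    picks = []
    for i in range(k):
        point = start + i * step
        idx = bisect_right(cumulative, point)
        picks.append(min(idx, len(sizes) - 1))  # guard against float rounding
    return picks


def multistage_sample(frame, n_clusters, n_schools, rng):
    """Stage 1: clusters by PPS (size = number of schools).
    Stage 2: schools within each selected cluster with equal probability."""
    names = sorted(frame)
    sizes = [len(frame[name]) for name in names]
    chosen = [names[i] for i in pps_systematic(sizes, n_clusters, rng)]
    return {
        name: rng.sample(frame[name], min(n_schools, len(frame[name])))
        for name in chosen
    }


# Hypothetical frame: 8 clusters with differing numbers of schools.
cluster_sizes = [10, 20, 30, 40, 25, 15, 35, 5]
frame = {
    f"C{i + 1}": [f"C{i + 1}-S{j + 1}" for j in range(n)]
    for i, n in enumerate(cluster_sizes)
}

sample = multistage_sample(frame, n_clusters=3, n_schools=4, rng=random.Random(0))
for cluster, schools in sample.items():
    print(cluster, schools)
```

A third stage would repeat the equal-probability draw (`rng.sample`) on the teacher and pupil lists of each selected school.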
Detailed information about the specific sampling process conducted for the 2012 Kenya Education SDI is available in the SDI Country Report (“SDI-Report-Kenya”) included as part of the documentation that accompanies these datasets.
SDI survey estimates must be properly weighted using a sampling weight (expansion factor) to ensure that they are representative of the population of interest.
School-level sampling weights are stored in the "KEN_2012_EDU_Weights.dta" file.
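As a rough illustration of what weighting changes, the sketch below compares an unweighted and a weighted mean. The three school values and weights are made-up numbers, and `weighted_mean` is a hypothetical helper; the actual variable names in the weights file are not stated here and should be taken from the data dictionary.

```python
def weighted_mean(values, weights):
    """Weighted mean that skips observations with a missing (None) value."""
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    total_weight = sum(w for _, w in pairs)
    if total_weight == 0:
        raise ValueError("no non-missing observations with positive weight")
    return sum(v * w for v, w in pairs) / total_weight


# Made-up example: absence rates for three schools and their sampling weights.
absence_rate = [0.10, 0.20, 0.40]
school_weight = [50, 30, 20]

unweighted = sum(absence_rate) / len(absence_rate)
weighted = weighted_mean(absence_rate, school_weight)
print(f"unweighted: {unweighted:.3f}  weighted: {weighted:.3f}")
```

In practice the weights file would first be merged onto the school-level data by the school identifier. Point estimates follow the same formula, but standard errors also need to account for the stratified, clustered design.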
Dates of Data Collection
Data Collection Mode
The SDI Education Survey Questionnaire consists of six modules:
Module 1: School Information - Administered to the head of the school to collect information on school type, facilities, school governance, pupil numbers, and school hours. It includes direct observations of school infrastructure by enumerators.
Module 2a: Teacher Absence and Information - Administered to the headteacher and individual teachers to obtain a list of all school teachers, to measure teacher absence, and to collect information on teacher characteristics.
Module 2b: Teacher Absence and Information - Unannounced visit to the school to assess the absence rate.
Module 3: School Finances - Administered to the headteacher to collect information on school finances (these data are not harmonized).
Module 4: Classroom Observation - An observation module to assess teaching activities and classroom conditions.
Module 5: Pupil Assessment - A test to measure the learning outcomes of grade four pupils in mathematics and language. The enumerator administers the test orally and one-on-one with each student.
Module 6: Teacher Assessment - A test of teachers covering mathematics and language subject knowledge and teaching skills.
Data entry was done using CSPro; quality control was performed in Stata.
Datasets have been cleaned, anonymized, and harmonized using Stata.
For some variables and observations, missings were recoded and/or distinctly labeled as:
Skips: "-999" or ".a"
Don't Know: "999", "99", or ".b"
Confidential Information (Anonymized): "Confidential" or ".c"
Not Available: ".n"
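When the data are read outside Stata, these codes can be mapped to explicit categories before analysis. The sketch below is one possible approach, not part of the SDI tooling: the category labels and the `classify_missing` helper are hypothetical. Because "99" or "999" may be a legitimate value for some numeric variables, a mapping like this should be applied per variable after checking the codebook.

```python
# Codes listed in the documentation, mapped to hypothetical category labels.
MISSING_CODES = {
    "-999": "skip",
    ".a": "skip",
    "999": "dont_know",
    "99": "dont_know",
    ".b": "dont_know",
    "Confidential": "confidential",
    ".c": "confidential",
    ".n": "not_available",
}


def classify_missing(raw):
    """Return the missing-value category for a raw cell, or None if it is data."""
    if isinstance(raw, float) and raw.is_integer():
        raw = int(raw)  # 99.0 -> 99, so numeric codes read as floats still match
    key = str(raw).strip()  # tolerate stray whitespace such as ".c "
    return MISSING_CODES.get(key)


for cell in [-999, 99.0, ".c ", "Confidential", ".n", 42, "Nairobi"]:
    print(repr(cell), "->", classify_missing(cell))
```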
Estimates of Sampling Error
At the national level, the anticipated standard errors were 1.6 percentage points for absenteeism and 4.4 percentage points for pupil literacy. At the county level, the anticipated standard errors were 3.1 percent for absenteeism and 9.0 percent for literacy.
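To give a sense of scale, an anticipated standard error can be converted into an approximate 95% margin of error by multiplying it by 1.96. This assumes a normal approximation for the estimator, which is an assumption of this sketch rather than something stated in the survey documentation; `margin_of_error` is a hypothetical helper.

```python
def margin_of_error(standard_error, z=1.96):
    """Half-width of an approximate confidence interval (normal approximation)."""
    return z * standard_error


# Anticipated standard errors quoted above.
anticipated_se = {
    ("national", "absenteeism"): 1.6,
    ("national", "literacy"): 4.4,
    ("county", "absenteeism"): 3.1,
    ("county", "literacy"): 9.0,
}

margins = {key: margin_of_error(se) for key, se in anticipated_se.items()}
for (level, indicator), margin in margins.items():
    print(f"{level:8s} {indicator:12s} +/- {margin:.2f}")
```

Under this approximation, county-level literacy estimates carry a margin of error roughly four times larger than national ones, so county comparisons call for more caution.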
HD Practice Group, The World Bank
Details about the anonymization process can be found in the documentation provided with these datasets.
The harmonized, anonymized datasets are available as public use files.
Researchers who feel that they need non-anonymized data should contact firstname.lastname@example.org with a statement of research objectives and a rationale for why they require such data. That will start the research use file discussion.
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Waly Wane, The World Bank. Kenya Service Delivery Indicators Education Survey (SDI-E) 2012 - Harmonized Public Use Data, Ref. KEN_2012_SDI-E_v01_M_v02_A_PUF. Dataset downloaded from [URL] on [date].
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
DDI Document ID
Development Data Group
The World Bank
Date of Metadata Production
DDI Document version
Version 02 (February 2020). This version differs from the original version as it includes:
1. Data reorganized by unit of analysis (school, teacher, pupil, time-on-task).
2. New round of data cleaning.
3. New round of data harmonization and anonymization.
4. Addition of anonymized variables from module 3 on School Management/Finances.
The metadata and documentation included in this version reflect and explain these changes.