The objective of the survey was to produce baselines for 15 large urban centers in Kenya. The urban centers covered were Nairobi, Mombasa, Naivasha, Nakuru, Malindi, Eldoret, Garissa, Embu, Kitui, Kericho, Thika, Kakamega, Kisumu, Machakos, and Nyeri. The survey covered the following issues: (a) household characteristics; (b) household economic profile; (c) housing, tenure, and rents; and (d) infrastructure services. The survey was undertaken to deepen understanding of the cities’ growth dynamics and to identify specific challenges to quality of life for residents. The survey pays special attention to living conditions for residents of formal versus informal settlements, poor versus non-poor households, and male- versus female-headed households.
Kind of data
Sample survey data [ssd]
- v2.1: Edited, anonymous dataset for public distribution.
Unit of analysis
Producers and sponsors
Kenya Ministry of Local Government
Kenya National Bureau of Statistics
Bill and Melinda Gates Foundation
Swedish International Development Cooperation Agency
The Kenya State of the Cities Baseline Survey aims to produce reliable estimates of key indicators related to demographic profile, infrastructure access and economic profile for each of the 15 towns and cities based on representative samples, including representative samples of households (HHs) residing in slum and non-slum areas. For this baseline household survey, NORC used a two- or three-stage stratified cluster sampling design within each of the 15 urban centers. The first-stage sampling frame was based on the 2009 census frame of enumeration areas (EAs). For each of the 15 towns and cities, NORC received the sampling frame of EAs from the Kenya National Bureau of Statistics (KNBS). In the first stage, NORC selected a sample of EAs as primary sampling units (PSUs). The second stage involved a random selection of households as secondary sampling units (SSUs) from each selected EA. To manage the field interviewing efficiently, a fixed number of HHs was drawn from each selected EA, irrespective of EA size. The third stage arose only for very large EAs (those containing more than 200 households), which were divided into 2, 3 or 4 segments, from which one segment was selected randomly for household selection.
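The within-EA selection described above can be illustrated with a minimal Python sketch. It is not NORC's actual procedure: the function names, the modeling of segments as contiguous chunks of a household list, and the way the number of segments is supplied are all assumptions made for illustration. Only the 200-household threshold, the 2 to 4 segments, and the fixed draw per EA come from the text.

```python
import random

def pick_segment(households, n_segments, rng):
    """Split an EA into n_segments contiguous chunks and pick one at random.

    Contiguous chunking is an assumption; the source only says that large
    EAs were divided into 2, 3 or 4 segments.
    """
    size = -(-len(households) // n_segments)  # ceiling division
    segments = [households[i:i + size] for i in range(0, len(households), size)]
    return rng.choice(segments)

def sample_households(households, n_select, rng,
                      large_ea_threshold=200, n_segments=2):
    """Draw a fixed number of households from one EA.

    EAs above the threshold are first reduced to one random segment
    (the third stage); smaller EAs are sampled directly.
    """
    pool = households
    if len(households) > large_ea_threshold:
        pool = pick_segment(households, n_segments, rng)
    if len(pool) <= n_select:
        return list(pool)  # take every household when the pool is small
    return rng.sample(pool, n_select)

# Hypothetical EA with 450 listed households, split into 3 segments.
rng = random.Random(0)
large_ea = list(range(450))
large_sample = sample_households(large_ea, n_select=10, rng=rng, n_segments=3)

# Hypothetical small EA: fewer households than the draw size.
small_sample = sample_households(list(range(6)), n_select=10, rng=rng)
```

Because the draw size is fixed regardless of EA size, households in large EAs have a lower selection probability than those in small EAs, which is why such designs are normally analyzed with sampling weights.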
Stratification of Enumeration Areas: A few stratification factors were available for stratifying the EAs to help to achieve the survey objectives. As mentioned earlier, for this baseline survey we wanted to draw representative samples from slum and non-slum areas and also to include poor/non-poor households (HHs). For the 2009 census, depending on the location, KNBS divided the EAs into three categories: rural, urban, and peri-urban.
Although there is a clear distinction of EAs into slum and non-slum areas, it is hard to classify EAs into poor and non-poor categories. To guarantee enough representation of HHs living in slum and non-slum areas (also referred to as informal and formal areas, respectively) as well as HHs living below and above the poverty line, NORC stratified the first-stage sampling units (EAs) into strata based on EA type (3 types) and settlement type (2 types). Given the resources available, we believe this stratification serves our purpose, as HHs living in slum and in rural areas tend to be poor. Table 1 in Appendix C of the final Overview Report (provided under the Related Materials tab) presents the allocation of sampled EAs across the strata for each of the 15 cities in the baseline survey.
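Crossing three EA types with two settlement types yields up to six strata per city. The following Python sketch shows one way such a first-stage selection could be organized. The field names, the toy frame, the allocation counts, and the use of simple random sampling within strata are all hypothetical; the actual allocation is in Table 1 of Appendix C, and the source does not state the within-stratum selection method.

```python
import random
from collections import defaultdict

def stratify(frame):
    """Group EA identifiers by (EA type, settlement type)."""
    strata = defaultdict(list)
    for ea in frame:
        strata[(ea["ea_type"], ea["settlement"])].append(ea["ea_id"])
    return dict(strata)

def select_eas(strata, allocation, rng):
    """Select EAs within each stratum.

    Simple random sampling is an assumption made for this sketch.
    """
    selected = {}
    for key, ea_ids in strata.items():
        n = min(allocation.get(key, 0), len(ea_ids))
        selected[key] = rng.sample(ea_ids, n)
    return selected

# Hypothetical frame for one city: 60 EAs spread over the six strata.
ea_types = ["rural", "urban", "peri-urban"]
settlements = ["slum", "non-slum"]
frame = [
    {"ea_id": i, "ea_type": ea_types[i % 3], "settlement": settlements[i % 2]}
    for i in range(60)
]

strata = stratify(frame)
# Hypothetical allocation: 3 EAs per stratum.
allocation = {key: 3 for key in strata}
selected = select_eas(strata, allocation, random.Random(1))
```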
Sampling households is not as straightforward as the first-stage sampling of EAs, since no household frame from the 2009 census exists. In the absence of a household sampling frame, NORC carried out a listing of HHs within each EA selected in the first stage. Trained listers, accompanied by local cluster guides (local residents with some form of authority in the EA), systematically listed all households in each selected EA, gathering the address, names of the head of household and spouse, household description, latitude and longitude. To ensure completeness of listing data, avoid duplication and make it easier to locate households that were eventually selected for interview, listers enumerated households by chalking a household identification number above the household doorway (an accepted practice for national surveys). The sampling frame of HHs produced from the listing activity was, therefore, up-to-date and included new formal and informal settlements that appeared after the 2009 census.
For adequate representativeness and to manage the interviewing task efficiently, NORC planned seven completed household interviews per EA. The final recommended sample size for the Kenya State of the Cities baseline survey is found in Table 2 in Appendix C of the final Overview Report.
Because the expected response rate was unknown prior to the start of the field period, the sampling team randomly selected ten households per enumeration area and distributed them to the interviewers working within the EA. Interviewing teams were instructed to complete at least seven interviews per EA from among the ten selected households. Interviewers were instructed to attempt at least three contacts with each selected household, approaching potential respondents on different days of the week and at different times of day. Table 2 presents the final number of EAs listed per city and the final number of completed interviews per city. The table also presents the percentage of planned EAs and planned interviews that were completed. Please note that in several cities more interviews were completed than planned. As part of NORC's data quality plan, data collection teams were instructed to overshoot the target of seven interviews per EA slightly, if feasible, to mitigate any potential loss of cases due to poor quality or uncooperative respondents. Few cases were lost due to poor quality, so the number of completed interviews remains above 100 percent of the target in ten of the fifteen cities.
The completion rate is reported as the number of households that successfully completed an interview divided by the total number of households selected for the EA. Completion rates are shown by city in Table 5 in Appendix C of the final Overview Report. The average rate is 68.66 percent, with variation from 66 to 74 percent (aside from Nairobi at 61.47 percent and Machakos at 56 percent). As described earlier, ten households were selected per EA if the EA contained more than ten households. For EAs containing fewer than ten households, all households were selected. In some EAs, more than ten households were selected due to a central office error.
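As a worked illustration of the completion rate definition, the short Python sketch below computes the rate for a single EA and a pooled rate across several EAs. The function names and the example counts are hypothetical, and pooling counts across EAs is an assumption: the source does not say whether city-level rates are pooled or averaged over EAs.

```python
def completion_rate(completed, selected):
    """Completed interviews as a percentage of selected households."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    return 100.0 * completed / selected

def pooled_completion_rate(ea_counts):
    """Pooled rate over (completed, selected) pairs, one pair per EA."""
    total_completed = sum(c for c, _ in ea_counts)
    total_selected = sum(s for _, s in ea_counts)
    return completion_rate(total_completed, total_selected)

# An EA that meets the target of seven interviews from ten selected households.
target_rate = completion_rate(7, 10)

# Hypothetical city with three EAs: (completed, selected) per EA.
city_rate = pooled_completion_rate([(7, 10), (8, 10), (6, 10)])
```

An EA that exactly meets the design target of seven completed interviews out of ten selected households has a completion rate of 70 percent, which is close to the reported average.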
Dates of collection
Mode of data collection
Data collection supervision
Staffing the large-scale data collection was a crucial factor in establishing high-quality data. Supervisors and interviewers were recruited by IRC using guidelines developed by NORC, which emphasized CAPI experience, face-to-face interviewing experience, the ability to gain cooperation and a commitment to data quality. Interviewers were grouped into eight teams of 6-8 interviewers, each of which was led by an IRC supervisor with experience managing complex face-to-face social scientific surveys.
Training for the data collection team took place in three phases. In the first phase, supervisors were recruited, with particular care taken to include supervisors from ethnic and linguistic groups represented among the 15 cities. Supervisors participated in a five-day pretesting activity that included 2.5 days of classroom and small group training to become familiar with the tablet computers and programmed questionnaire, followed by two days of pretesting among a convenience sample of respondents in informal settlements in Nairobi.
The second phase of training included a one-day Training of Trainers (ToT) and two days of Supervisor training, including detailed instruction on carrying out listing and sampling, gaining cooperation among respondents, coaching interviewers, reporting and ensuring quality control, confidentiality and security. Eight supervisors attended the ToT and Supervisor training.
The third phase of training included five days of classroom and small group activities for the 58 interviewers brought to training, followed by two days of piloting among a convenience sample in informal areas of Nairobi. All interviewers were required to pass a practical exam using the tablet questionnaire and to successfully demonstrate all listing and interviewing tasks during the two-day pilot. After training, three interviewers were dismissed from the data collection.
The questionnaire was developed by World Bank staff with input from stakeholders in the Kenya Municipal Program and NORC researchers and survey methodologists. The base questionnaire for the project was a 2004 World Bank survey of Nairobi slums. However, an extended iterative review process led to many changes in the questionnaire. The final version that was used for programming is provided under the Related Materials tab and in Volume II of the Overview.
The questionnaire’s topical coverage is indicated by the titles of its nine modules:
1. Demographics and household composition
2. Security of housing, land and tenure
3. Housing and settlement profile
4. Economic profile
5. Infrastructure services
7. Household enterprises
8. Civil participation and respondent tracking
Before being granted access to the dataset, all users have to formally agree:
1. To make no copies of any files or portions of files to which s/he is granted access except those authorized by the data depositor.
2. Not to use any technique in an attempt to learn the identity of any person, establishment, or sampling unit not identified on public use data files.
3. To hold in strictest confidence the identification of any establishment or individual that may be inadvertently revealed in any documents or discussion, or analysis. Such inadvertent identification revealed in her/his analysis will be immediately brought to the attention of the data depositor.
- Public use files, accessible to all
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Example: Gulyani Sumila, Wendy Ayres, Ray Struyk and Clifford Zinnes. 2012. Kenya State of the Cities Baseline Survey 2012-2013. Ref: KEN_2012_SOCBL_v01_M. World Bank & NORC. Downloaded from [URL] on [Date]
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.