The 2003 Kenya Demographic and Health Survey (2003 KDHS) is a nationally representative sample survey of 8,195 women age 15 to 49 and 3,578 men age 15 to 54 selected from 400 sample points (clusters) throughout Kenya. It is designed to provide data to monitor the population and health situation in Kenya as a follow-up of the 1989, 1993 and 1998 KDHS surveys. The survey utilised a two-stage sample based on the 1999 Population and Housing Census and was designed to produce separate estimates for key indicators for each of the eight provinces in Kenya. Unlike prior KDHS surveys, the 2003 KDHS covered the northern half of Kenya. Data collection took place over a five-month period, from 18 April to 15 September 2003.
The 2003 Kenya Demographic and Health Survey (KDHS) is the latest in a series of national level population and health surveys to be carried out in Kenya in the last three decades. The 2003 KDHS is designed to provide data to monitor the population and health situation in Kenya and to be a follow-up to the 1989, 1993, and 1998 KDHS surveys.
The survey obtained detailed information on fertility levels; marriage; sexual activity; fertility preferences; awareness and use of family planning methods; breastfeeding practices; nutritional status of women and young children; childhood and maternal mortality; maternal and child health; and awareness and behaviour regarding HIV/AIDS and other sexually transmitted infections. New features of the 2003 KDHS include the collection of information on malaria and the use of mosquito nets, domestic violence, and HIV testing of adults.
More specifically, the objectives of the 2003 KDHS were to:
- At the national and provincial level, provide data that allow the derivation of demographic rates, particularly fertility and childhood mortality rates, which can be used to evaluate the achievements of the current national population policy for sustainable development;
- Measure changes in fertility and contraceptive prevalence and at the same time study the factors that affect these changes, such as marriage patterns, desire for children, availability of contraception, breastfeeding habits, and important social and economic factors;
- Examine the basic indicators of maternal and child health in Kenya, including nutritional status, use of antenatal and maternity services, treatment of recent episodes of childhood illness, use of immunisation services, use of mosquito nets, and treatment of children and pregnant women for malaria;
- Describe the patterns of knowledge and behaviour related to the transmission of HIV/AIDS and other sexually transmitted infections;
- Estimate adult and maternal mortality ratios at the national level;
- Ascertain the extent and pattern of domestic violence and female genital cutting in the country;
- Estimate the prevalence of HIV in the country at the national and provincial level and use the data to corroborate the rates from the sentinel surveillance system.
Kind of data
Sample survey data
The 2003 KDHS was the first survey in the Demographic and Health Surveys (DHS) programme to cover the entire country, including North Eastern Province and other northern districts that had been excluded from the prior surveys (Turkana and Samburu in Rift Valley Province and Isiolo, Marsabit, and Moyale in Eastern Province).
Unit of analysis
- Women age 15-49
- Men age 15-54
- Children under five
All women age 15-49 years who were either usual residents of the households in the sample or visitors present in the household on the night before the survey were eligible to be interviewed in the survey. The survey collected information on demographic and health issues from a sample of women in the reproductive ages (15-49) and from men age 15-54 years in the one-in-two sub-sample of households selected for the male survey.
Producers and sponsors
Central Bureau of Statistics (CBS)
Ministry of Health
National Council for Population and Development
Government of Kenya
United States Agency for International Development
United Kingdom Department for International Development
United Nations Population Fund
Japan International Co-operation Agency
United Nations Development Programme
United Nations Children’s Fund
Centers for Disease Control and Prevention
National AIDS and STIs Control Programme (NASCOP)
Centers for Disease Control and Prevention
Kenya Medical Research Institute (KEMRI)
National Council for Population and Development (NCPD)
The sample for the 2003 KDHS covered the population residing in households in the country. A representative probability sample of almost 10,000 households was selected for the KDHS sample. This sample was constructed to allow for separate estimates for key indicators for each of the eight provinces in Kenya, as well as for urban and rural areas separately. Given the difficulties in traveling and interviewing in the sparsely populated and largely nomadic areas in the North Eastern Province, a smaller number of households was selected in this province. Urban areas were oversampled. As a result of these differing sample proportions, the KDHS sample is not self-weighting at the national level; consequently, all tables except those concerning response rates are based on weighted data.
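Because the sample is not self-weighting, an unweighted tabulation would over-represent the oversampled groups. The following is a minimal sketch of a weighted proportion; the outcomes and weights are invented for illustration and are not KDHS values.

```python
def weighted_proportion(outcomes, weights):
    """Proportion of cases with outcome 1, counting each case by its sampling weight."""
    if len(outcomes) != len(weights):
        raise ValueError("outcomes and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(y * w for y, w in zip(outcomes, weights)) / total_weight


# Hypothetical respondents: two from an oversampled stratum (small weights)
# and one from an undersampled stratum (large weight).
outcomes = [1, 1, 0]
weights = [0.5, 0.5, 2.0]

unweighted = sum(outcomes) / len(outcomes)
weighted = weighted_proportion(outcomes, weights)
```

In this toy case the unweighted proportion is two thirds while the weighted proportion is one third, which shows why the tables rely on weighted data.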
The survey utilised a two-stage sample design. The first stage involved selecting sample points (“clusters”) from a national master sample maintained by CBS (the fourth National Sample Survey and Evaluation Programme [NASSEP IV]). The list of enumeration areas covered in the 1999 population census constituted the frame for the NASSEP IV sample selection and thus for the KDHS sample as well. A total of 400 clusters, 129 urban and 271 rural, were selected from the master frame. The second stage of selection involved the systematic sampling of households from a list of all households that had been prepared for NASSEP IV in 2002. The household listing was updated in May and June 2003 in 50 selected clusters in the largest cities because of the high rate of change in structures and household occupancy in the urban areas.
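The second-stage selection can be illustrated with a generic systematic sampling routine. This is a sketch only: the function name, the cluster size of 100, and the take of 25 households are assumptions for illustration, not the parameters used by CBS.

```python
import random


def systematic_sample(household_ids, n_select, rng=None):
    """Select n_select households at a fixed interval from a random start."""
    rng = rng or random.Random()
    total = len(household_ids)
    if not 0 < n_select <= total:
        raise ValueError("n_select must be between 1 and the list length")
    interval = total / n_select        # sampling interval (may be fractional)
    start = rng.random() * interval    # random start in [0, interval)
    selected = []
    for i in range(n_select):
        # Guard against floating-point overshoot at the end of the list.
        position = min(int(start + i * interval), total - 1)
        selected.append(household_ids[position])
    return selected


# Hypothetical household listing for one cluster.
listing = list(range(1, 101))
selected = systematic_sample(listing, 25, random.Random(2003))
```

A fixed interval spreads the selected households evenly across the listing, which is the usual reason for preferring systematic selection over independent random draws within a cluster.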
All women age 15-49 years who were either usual residents of the households in the sample or visitors present in the household on the night before the survey were eligible to be interviewed in the survey. In addition, in every second household selected for the survey, all men age 15-54 years were eligible to be interviewed if they were either permanent residents or visitors present in the household on the night before the survey. All women and men living in the households selected for the Men's Questionnaire and eligible for the individual interview were asked to voluntarily give a few drops of blood for HIV testing.
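The eligibility rules above can be written as a small predicate. The function and argument names are hypothetical; the age ranges and the every-second-household rule for men are taken from the text.

```python
def eligible_for_interview(sex, age, usual_resident, slept_here_last_night,
                           in_male_subsample):
    """Apply the individual-interview eligibility rules described in the text."""
    # Usual residents qualify, as do visitors present the night before the survey.
    present = usual_resident or slept_here_last_night
    if sex == "female":
        return present and 15 <= age <= 49
    if sex == "male":
        # Men are interviewed only in every second selected household.
        return in_male_subsample and present and 15 <= age <= 54
    return False


def eligible_for_hiv_test(sex, age, usual_resident, slept_here_last_night,
                          in_male_subsample):
    """Blood is requested only in households selected for the Men's Questionnaire."""
    return in_male_subsample and eligible_for_interview(
        sex, age, usual_resident, slept_here_last_night, in_male_subsample)
```

For example, a 52-year-old man visiting a household in the male subsample is eligible, whereas a 52-year-old woman in the same household is not.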
A total of 9,865 households were selected in the sample, of which 8,889 were occupied and therefore eligible for interviews. The shortfall was largely due to structures that were found to be vacant or destroyed. Of the 8,889 existing households, 8,561 were successfully interviewed, yielding a household response rate of 96 percent.
In the households interviewed in the survey, 8,717 eligible women were identified; interviews were completed with 8,195 of these women, yielding a response rate of 94 percent. With regard to the male survey results, 4,183 eligible men were identified in the subsample of households selected for the male survey, of whom 3,578 were successfully interviewed, yielding a response rate of 86 percent. Response rates were higher in rural areas than in urban areas for both men and women.
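The response rates quoted above follow directly from the counts in the text, as this short check shows. The helper name is an assumption; the counts are those reported for the survey.

```python
def response_rate(completed, eligible):
    """Completed interviews as a percentage of eligible units."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    return 100.0 * completed / eligible


household_rate = response_rate(8561, 8889)  # households interviewed / occupied
women_rate = response_rate(8195, 8717)      # women interviewed / eligible
men_rate = response_rate(3578, 4183)        # men interviewed / eligible
```

Rounded to whole percentages, these reproduce the reported 96, 94, and 86 percent.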
The principal reason for nonresponse among both eligible men and women was the failure to find individuals despite repeated visits to the household and even sometimes the work place. The substantially lower response rate for men reflects the more frequent and longer absences of men from the household.
Response rates for the HIV testing component were lower than those for the interviews.
Dates of collection
Mode of data collection
Three questionnaires were used in the survey: a) the Household Questionnaire, b) the Women's Questionnaire and c) the Men's Questionnaire. The contents of these questionnaires were based on model questionnaires developed by the MEASURE DHS+ programme.
In consultation with a broad spectrum of technical institutions, government agencies, and local and international organisations, CBS modified the DHS model questionnaires to reflect relevant issues in population, family planning, HIV/AIDS, and other health issues in Kenya. A number of thematic questionnaire design committees were organised by CBS. Periodic meetings of each of the thematic committees, as well as the final meeting, were also arranged by CBS. The inputs generated in these meetings were used to finalise survey questionnaires. These questionnaires were then translated from English into Kiswahili and 11 other local languages (Embu, Kalenjin, Kamba, Kikuyu, Kisii, Luhya, Luo, Maasai, Meru, Mijikenda, and Somali). The questionnaires were further refined after the pretest and training of the field staff.
a) The Household Questionnaire was used to list all of the usual members and visitors in the selected households. Some basic information was collected on the characteristics of each person listed, including age, sex, education, and relationship to the head of the household. The main purpose of the Household Questionnaire was to identify women and men who were eligible for the individual interview. The Household Questionnaire also collected information on characteristics of the household's dwelling unit, such as the source of water, type of toilet facilities, materials used for the floor and roof of the house, ownership of various durable goods, and ownership and use of mosquito nets. In addition, this questionnaire was used to record height and weight measurements of women age 15-49 years and children under the age of 5 years, households eligible for collection of blood samples, and the respondents' consent to voluntarily give blood samples. The HIV testing procedures are described in detail in the next section.
b) The Women's Questionnaire was used to collect information from all women age 15-49 years and covered the following topics:
- Background characteristics (e.g., education, residential history, media exposure)
- Reproductive history
- Knowledge and use of family planning methods
- Fertility preferences
- Antenatal and delivery care
- Vaccinations and childhood illnesses
- Marriage and sexual activity
- Woman's work and husband's background characteristics
- Infant and child feeding practices
- Childhood mortality
- Awareness and behaviour about AIDS and other sexually transmitted diseases
- Adult mortality including maternal mortality.
The Women's Questionnaire also included a series of questions to obtain information on women's experience of domestic violence. These questions were administered to one woman per household. In households with two or more eligible women, special procedures were followed, which ensured that there was random selection of the woman to be interviewed.
c) The Men's Questionnaire was administered to all men age 15-54 years living in every second household in the sample. The Men's Questionnaire collected information similar to that contained in the Women's Questionnaire, but was shorter because it did not contain questions on reproductive history, maternal and child health, nutrition, maternal mortality, and domestic violence.
All aspects of the KDHS data collection were pretested in November and December 2002. Thirteen teams (one for each language) were formed, each with one female interviewer, one male interviewer, and one health worker. The 39 team members were trained for two weeks and then proceeded to conduct interviews in the various districts in which their language was spoken. In total, 260 households were covered in the pretest. The lessons learnt from the pretest were used to finalise the survey instruments and logistical arrangements for the survey. The pretest underscored the desirability of including voluntary counselling and testing (VCT) for HIV/AIDS as an integral part of the survey, since many respondents during the pretest wanted to know their HIV status.
Central Bureau of Statistics (CBS)
The processing of the 2003 KDHS results began shortly after the fieldwork commenced. Completed questionnaires were returned periodically from the field to CBS offices in Nairobi, where they were edited and entered by data processing personnel specially trained for this task. Data were entered using CSPro. All data were entered twice (100 percent verification). The concurrent processing of the data was a distinct advantage for data quality, since CBS was able to advise field teams of errors detected during data entry. The data entry and editing phase of the survey was completed in October 2003.
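Double entry works by keying every questionnaire twice and comparing the two passes field by field. The sketch below illustrates the idea in generic Python; it is not CSPro's verification routine, and the record identifiers and field names are hypothetical.

```python
def find_discrepancies(first_pass, second_pass):
    """Return (record_id, field, first_value, second_value) for every mismatch."""
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(record_id, {})
        b = second_pass.get(record_id, {})
        for field in sorted(set(a) | set(b)):
            if a.get(field) != b.get(field):
                mismatches.append((record_id, field, a.get(field), b.get(field)))
    return mismatches


# Hypothetical keyed records: the second pass transposes the digits of one age.
first = {"0001": {"age": 27, "sex": "F"}, "0002": {"age": 34, "sex": "M"}}
second = {"0001": {"age": 27, "sex": "F"}, "0002": {"age": 43, "sex": "M"}}
discrepancies = find_discrepancies(first, second)
```

Each reported mismatch would then be resolved against the paper questionnaire.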
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2003 NDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the 2003 NDHS is the ISSA Sampling Error Module. This module uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
The Jackknife repeated replication method derives estimates of complex rates from each of several replications of the parent sample, and calculates standard errors for these estimates using simple formulae. Each replication considers all but one cluster in the calculation of the estimates. Pseudo-independent replications are thus created. In the 2003 NDHS, there were 362 non-empty clusters. Hence, 361 replications were created.
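The replication idea can be sketched as a delete-one-cluster jackknife. This is an illustration, not the ISSA module: it assumes the pseudo-value form commonly given in DHS reports, r_i = k·r − (k−1)·r_(i), with variance Σ(r_i − r)² / (k(k−1)), and the cluster totals are hypothetical.

```python
import math


def jackknife_se(clusters, estimator):
    """Delete-one-cluster jackknife: return (full-sample estimate, standard error)."""
    k = len(clusters)
    if k < 2:
        raise ValueError("at least two clusters are required")
    full = estimator(clusters)
    pseudo_values = []
    for i in range(k):
        reduced = clusters[:i] + clusters[i + 1:]   # drop cluster i
        pseudo_values.append(k * full - (k - 1) * estimator(reduced))
    variance = sum((p - full) ** 2 for p in pseudo_values) / (k * (k - 1))
    return full, math.sqrt(variance)


def ratio(clusters):
    """Ratio estimate: total events divided by total exposure across clusters."""
    return sum(y for y, _ in clusters) / sum(x for _, x in clusters)


# Hypothetical (events, exposure) totals for four clusters.
clusters = [(12, 40), (9, 35), (15, 42), (7, 30)]
estimate, se = jackknife_se(clusters, ratio)
```

When the estimator is a simple mean of cluster values, this formula reduces to the familiar standard error of the mean, which is a useful sanity check.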
In addition to the standard error, ISSA computes the design effect (DEFT) for each estimate, which is defined as the ratio between the standard error using the given sample design and the standard error that would result if a simple random sample had been used. A DEFT value of 1.0 indicates that the sample design is as efficient as a simple random sample, while a value greater than 1.0 indicates the increase in the sampling error due to the use of a more complex and less statistically efficient design. ISSA also computes the relative error and confidence limits for the estimates.
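The DEFT definition above can be expressed in a few lines. This sketch assumes the estimate is a proportion, so that the simple-random-sample standard error is sqrt(p(1−p)/n); the input values are hypothetical and the function names are not taken from ISSA.

```python
import math


def srs_standard_error(p, n):
    """Standard error of a proportion p under simple random sampling of size n."""
    return math.sqrt(p * (1.0 - p) / n)


def deft(se_design, se_srs):
    """Ratio of the design-based standard error to the SRS standard error."""
    if se_srs == 0:
        return None   # undefined when the SRS standard error is zero
    return se_design / se_srs


# Hypothetical estimate: p = 0.25 from 1,200 cases, design-based SE of 0.02.
se_srs = srs_standard_error(0.25, 1200)
design_effect = deft(0.02, se_srs)
```

Here the simple-random-sample standard error is 0.0125, so the DEFT is 1.6: the complex design inflates the standard error by 60 percent relative to a simple random sample of the same size.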
Sampling errors for the 2003 NDHS are calculated for selected variables considered to be of primary interest for the women's survey and for the men's survey, respectively. The results are presented in an appendix of the Final Report for the country as a whole, for urban and rural areas, and for each of the 6 regions. For each variable, the type of statistic (mean, proportion, or rate) and the base population are given in Table B.1 of the appendix of the Final Report. Tables B.2 to B.10 present the value of the statistic (R), its standard error (SE), the number of unweighted (N) and weighted (WN) cases, the design effect (DEFT), the relative standard error (SE/R), and the 95 percent confidence limits (R±2SE), for each variable. The DEFT is considered undefined when the standard error under simple random sampling is zero (when the estimate is close to 0 or 1). In the case of the total fertility rate, the number of unweighted cases is not relevant, as there is no known unweighted value for woman-years of exposure to childbearing.
The confidence interval (e.g., as calculated for children ever born to women aged 40-49) can be interpreted as follows: the overall average from the national sample is 6.808 and its standard error is 0.134. Therefore, to obtain the 95 percent confidence limits, one adds and subtracts twice the standard error to the sample estimate, i.e., 6.808±2×0.134. There is a high probability (95 percent) that the true average number of children ever born to all women aged 40 to 49 is between 6.540 and 7.077.
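The arithmetic in this example can be reproduced as follows. The helper name is an assumption; the estimate and standard error are the rounded values quoted in the text.

```python
def confidence_limits(estimate, standard_error, multiplier=2.0):
    """Return (lower, upper) limits as estimate ± multiplier × standard error."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


lower, upper = confidence_limits(6.808, 0.134)
relative_se = 0.134 / 6.808   # relative standard error (SE/R)
```

From the rounded inputs the limits come out as 6.540 and 7.076. The last digit of the upper limit differs from the quoted 7.077, which would be consistent with the published limits having been computed from unrounded values, although the text does not say so.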
Sampling errors are analyzed for the national woman sample and for two separate groups of estimates: (1) means and proportions, and (2) complex demographic rates. The relative standard errors (SE/R) for the means and proportions range between 1.1 percent and 32.7 percent with an average of 6.36 percent; the highest relative standard errors are for estimates of very low values (e.g., currently using female sterilization). If estimates of very low values (less than 10 percent) were removed, then the average drops to 4.2 percent. So in general, the relative standard error for most estimates for the country as a whole is small, except for estimates of very small proportions. The relative standard error for the total fertility rate is small, 2.5 percent. However, for the mortality rates, the average relative standard error is much higher, 6.04 percent.
There are differentials in the relative standard error for the estimates of sub-populations. For example, for the variable want no more children, the relative standard errors as a percent of the estimated mean for the whole country, and for the urban areas are 4.9 percent and 6.1 percent, respectively.
For the total sample, the value of the design effect (DEFT), averaged over all variables, is 1.78 which means that, due to multi-stage clustering of the sample, the average standard error is increased by a factor of 1.78 over that in an equivalent simple random sample.
Other forms of data appraisal
Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2003 Kenya Demographic and Health Survey (KDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.