The primary objective of the South Africa Demographic and Health Survey (SADHS) 2016 is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the SADHS 2016 collected information on fertility levels; marriage; sexual activity; fertility preferences; awareness and use of contraceptives; breastfeeding practices; nutrition; childhood and maternal mortality; maternal health, including antenatal and postnatal care; key aspects of child health, including immunisation coverage and prevalence and treatment of acute respiratory infection (ARI), fever, and diarrhoea; potential exposure to the risk of HIV infection; coverage of HIV counselling and testing (HCT); and physical and sexual violence against women. Another critical objective of the SADHS 2016 is to provide estimates of health and behaviour indicators for adults age 15 and older, including use of tobacco, alcohol, and codeine-containing medications. In addition, the SADHS 2016 provides estimates of the prevalence of anaemia among children age 6-59 months and adults age 15 and older, and the prevalence of hypertension, anaemia, high HbA1c levels (an indicator of diabetes), and HIV among adults age 15 and older.
The information collected through the SADHS 2016 is intended to assist policymakers and programme managers in evaluating and designing programmes and strategies for improving the health of the country’s population.
Kind of data
Sample survey data [ssd]
The data dictionary was generated from hierarchical data downloaded from The DHS Program website (http://dhsprogram.com).
The survey was designed to provide representative estimates for the country as a whole, for urban and non-urban areas separately, and for each of the nine provinces in South Africa: Western Cape, Eastern Cape, Northern Cape, Free State, KwaZulu-Natal, North West, Gauteng, Mpumalanga, and Limpopo.
Unit of analysis
- Children age 0-5
- Women age 15-49
- Men age 15-59
The survey covered all de jure household members (usual residents), children age 0-5 years, women age 15-49 years and men age 15-59 years resident in the household.
Producers and sponsors
Statistics South Africa (Stats SA)
Government of South Africa
South African Medical Research Council
The DHS Program
Provided technical assistance through The DHS Program
Government of South Africa
Global Fund to Fight AIDS, Tuberculosis and Malaria (Global Fund)
United Nations Children’s Fund
United Nations Population Fund
United States Agency for International Development
The sampling frame used for the SADHS 2016 is the Statistics South Africa Master Sample Frame (MSF), which was created using Census 2011 enumeration areas (EAs). In the MSF, EAs of manageable size were treated as primary sampling units (PSUs), whereas small neighbouring EAs were pooled together to form new PSUs, and large EAs were split into conceptual PSUs. The frame contains information about the geographic type (urban, traditional, or farm) and the estimated number of residential dwelling units (DUs) in each PSU. By convention, Stats SA samples DUs rather than households. One or more households may be located in any given DU; recent surveys have found 1.03 households per DU on average.
Administratively, South Africa is divided into nine provinces. The sample for the SADHS 2016 was designed to provide estimates of key indicators for the country as a whole, for urban and non-urban areas separately, and for each of the nine provinces in South Africa. To ensure that the survey precision is comparable across provinces, PSUs were allocated by a power allocation rather than a proportional allocation. Each province was stratified into urban, farm, and traditional areas, yielding 26 sampling strata.
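The difference between proportional and power allocation can be illustrated with a minimal sketch. The domain sizes and the exponent of 0.5 below are made-up values for illustration only; the source does not state the exponent or the size measure actually used.

```python
def power_allocation(sizes, total_psus, alpha):
    """Allocate total_psus across domains in proportion to size ** alpha.

    alpha = 1 reproduces proportional allocation; alpha = 0 gives an equal
    split. Results are real-valued; rounding to whole PSUs is left out.
    """
    powered = {name: size ** alpha for name, size in sizes.items()}
    scale = total_psus / sum(powered.values())
    return {name: value * scale for name, value in powered.items()}


# Hypothetical household counts for three domains (not SADHS figures).
sizes = {"A": 4_000_000, "B": 1_000_000, "C": 250_000}

proportional = power_allocation(sizes, total_psus=210, alpha=1.0)
power = power_allocation(sizes, total_psus=210, alpha=0.5)
# proportional is approximately {A: 160, B: 40, C: 10}
# power is approximately        {A: 120, B: 60, C: 30}
```

With an exponent below 1, the smallest domain receives more PSUs than it would under proportional allocation, which is what makes precision more comparable across domains.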
The SADHS 2016 followed a stratified two-stage sample design with a probability proportional to size sampling of PSUs at the first stage and systematic sampling of DUs at the second stage. The Census 2011 DU count was used as the PSU measure of size. A total of 750 PSUs were selected from the 26 sampling strata, yielding 468 selected PSUs in urban areas, 224 PSUs in traditional areas, and 58 PSUs in farm areas.
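The two selection stages can be sketched as follows. This is an illustrative implementation of systematic probability-proportional-to-size selection followed by systematic selection of dwelling units, not the procedure Stats SA ran; the DU counts, the seed, and the take of 20 DUs per PSU are hypothetical.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """First stage: pick n PSUs with probability proportional to size.

    sizes holds one measure of size (e.g. a DU count) per PSU. Assumes no
    size exceeds the sampling interval; otherwise a PSU can repeat.
    """
    interval = sum(sizes) / n
    start = rng.random() * interval          # random start in [0, interval)
    cumulative = list(accumulate(sizes))
    selected, idx = [], 0
    for k in range(n):
        point = start + k * interval
        # Advance to the PSU whose cumulative size range contains the point.
        while cumulative[idx] < point and idx < len(cumulative) - 1:
            idx += 1
        selected.append(idx)
    return selected


def systematic_sample(n_units, n_select, rng):
    """Second stage: pick n_select of n_units at a fixed interval."""
    interval = n_units / n_select
    start = rng.random() * interval
    return [min(int(start + k * interval), n_units - 1) for k in range(n_select)]


rng = random.Random(2016)                              # arbitrary seed
du_counts = [120, 80, 200, 150, 95, 310, 60, 175]      # hypothetical frame
psus = pps_systematic(du_counts, n=3, rng=rng)
dus = systematic_sample(du_counts[psus[0]], n_select=20, rng=rng)
```

In a stratified design such as this one, the first-stage selection would be run separately within each stratum.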
For further details on sample design, see Appendix A of the final report.
A total of 15,292 households were selected for the sample, of which 13,288 were occupied. Of the occupied households, 11,083 were successfully interviewed, yielding a response rate of 83%.
In the interviewed households, 9,878 eligible women age 15-49 were identified for individual interviews; interviews were completed with 8,514 women, yielding a response rate of 86%. In the subsample of households selected for the male survey, 4,952 eligible men age 15-59 were identified and 3,618 were successfully interviewed, yielding a response rate of 73%. In this same subsample, 12,717 eligible adults age 15 and older were identified and 10,336 were successfully interviewed with the adult health module, yielding a response rate of 81%. Response rates were consistently lower in urban areas than in non-urban areas.
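The reported response rates can be checked from the counts given above. The sketch below uses a simple ratio of completed to eligible cases, which is an assumption; the final report may apply a more detailed formula based on result codes.

```python
def response_rate(completed, eligible):
    """Return the response rate as a percentage."""
    return 100 * completed / eligible


# (completed, eligible) counts quoted in the text above.
counts = {
    "households": (11_083, 13_288),
    "women 15-49": (8_514, 9_878),
    "men 15-59": (3_618, 4_952),
    "adult health module": (10_336, 12_717),
}

rates = {group: round(response_rate(done, elig))
         for group, (done, elig) in counts.items()}
# rates == {"households": 83, "women 15-49": 86,
#           "men 15-59": 73, "adult health module": 81}
```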
Design weights were adjusted for household nonresponse and individual nonresponse to obtain the sampling weights for households and for women age 15-49 and men age 15-59, respectively. The nonresponse adjustment was done using stratum-level adjustment factors. The household and individual sampling weights differ only because of individual nonresponse. For the household sampling weight, the household design weight is multiplied by the inverse of the household response rate by stratum. For the women’s individual sampling weight, the household sampling weight is multiplied by the inverse of the women’s individual response rate by stratum. Finally, for the men’s individual sampling weight, the household sampling weight for the male subsample is multiplied by the inverse of the men’s individual response rate by stratum.
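The chain of adjustments described above can be sketched for a single stratum. The design weight and response rates below are hypothetical values chosen for illustration, and the function name is not taken from any DHS software.

```python
def sampling_weights(design_weight, hh_response_rate, indiv_response_rate):
    """Apply stratum-level nonresponse adjustments to a design weight.

    Response rates are proportions in (0, 1], each computed within the
    stratum that the household belongs to.
    """
    # Household weight: design weight times inverse household response rate.
    household_weight = design_weight / hh_response_rate
    # Individual weight: household weight times inverse individual response rate.
    individual_weight = household_weight / indiv_response_rate
    return household_weight, individual_weight


# Hypothetical stratum: design weight 500, 80% household response,
# 90% response among eligible women.
hh_weight, woman_weight = sampling_weights(500.0, 0.80, 0.90)
# hh_weight is 625; woman_weight is about 694.4
```

The men's weight follows the same pattern, starting from the household weight for the male subsample and using the men's response rate.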
In addition to the standard weights for women age 15-49 and men age 15-59, separate weights were calculated for the adult health module that accounted for nonresponse among women age 15 and older and men age 15 and older. Moreover, a special weight was calculated for the domestic violence module to account for within-household selection and for nonresponse to the module. Special weights were also calculated for HIV and HbA1c tests to account for nonresponse with respect to these tests. The final sampling weights are normalised so that the total number of weighted cases equals the total number of unweighted cases at the national level. Normalisation is done by multiplying the sampling weight by the estimated total sampling fraction obtained from the survey for the household weight, the individual woman’s weight, the individual man’s weight, and the other weights mentioned above except for the sampling weights for HIV testing. In the case of the latter, the weights are normalised at the national level for women and men together so that HIV prevalence estimates calculated for women and men together are valid. The normalised weights are relative weights: they are valid for estimating means, proportions, and ratios, but not for estimating population totals or for analysing pooled data.
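A minimal sketch of the normalisation step follows. It assumes that the "estimated total sampling fraction" is the number of unweighted cases divided by the sum of the sampling weights; the raw weights are hypothetical.

```python
def normalise(weights):
    """Rescale weights so that they sum to the number of cases."""
    # Assumed sampling fraction: sample size over estimated population size.
    factor = len(weights) / sum(weights)
    return [w * factor for w in weights]


raw = [625.0, 625.0, 1250.0, 2500.0]   # hypothetical sampling weights
norm = normalise(raw)
# norm is approximately [0.5, 0.5, 1.0, 2.0]; it sums to 4, the case count
```

Because every weight is multiplied by the same constant, ratios between weights are preserved. That is why means, proportions, and ratios are unaffected while population totals can no longer be recovered.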
For further details on sampling weights, see Appendix A.4 of the final report.
Dates of collection
Mode of data collection
Five questionnaires were used in the SADHS 2016: the Household Questionnaire, the individual Woman’s Questionnaire, the individual Man’s Questionnaire, the Caregiver’s Questionnaire, and the Biomarker Questionnaire. These questionnaires, based on The DHS Program’s standard Demographic and Health Survey questionnaires, were adapted to reflect the population and health issues relevant to South Africa. Input was solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international donors. After the preparation of the questionnaires in English, the questionnaires were translated into South Africa’s 10 other official languages. In addition, information about the fieldworkers for the survey was collected through a self-administered Fieldworker Questionnaire.
Statistics South Africa
Government of South Africa
All electronic data files for the SADHS 2016 were transferred via the Internet File Streaming System (IFSS) to the Stats SA head office in Pretoria, where they were stored on a password-protected computer. The data processing operation included secondary editing, which required resolution of computer-identified inconsistencies and coding of open-ended questions. The data were processed by a core group of four people; secondary editing was completed by 11 people. All persons involved in data processing took part in the main fieldwork training, and they were supervised by senior staff from Stats SA with support from ICF. Data editing was accomplished using CSPro software. Secondary editing was initiated in October 2016 and completed in February 2017. To help check inconsistencies in immunisation dates, fieldworkers photographed the immunisation page of the Road-to-Health booklet on the tablet at the time of the interview.
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the SADHS 2016 to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the SADHS 2016 is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
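As an illustration of the two-standard-error rule described above, the sketch below builds an approximate 95% interval. The estimate and standard error are hypothetical and are not taken from the survey tables.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Return (lower, upper) bounds of estimate +/- multiplier * SE."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


# Hypothetical proportion of 0.50 with a standard error of 0.012.
low, high = confidence_interval(0.50, 0.012)
# low is about 0.476 and high is about 0.524
```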
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the SADHS 2016 sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed in SAS, using programs developed by ICF. These programs use the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
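The two variance estimation approaches named above can be sketched for a ratio r = y/x computed from weighted cluster totals. This is an illustrative Python version written for this description, not the SAS programs developed by ICF; the cluster data are hypothetical, the finite population correction is omitted, and the jackknife shown deletes one cluster at a time.

```python
from collections import defaultdict
from math import sqrt


def ratio(clusters):
    """clusters: list of (stratum, y_total, x_total) weighted cluster sums."""
    return sum(c[1] for c in clusters) / sum(c[2] for c in clusters)


def taylor_se(clusters):
    """Linearised standard error of y/x; needs >= 2 clusters per stratum."""
    r = ratio(clusters)
    x_total = sum(c[2] for c in clusters)
    residuals = defaultdict(list)
    for stratum, y, x in clusters:
        residuals[stratum].append(y - r * x)      # linearised value z = y - r*x
    variance = 0.0
    for z in residuals.values():
        m = len(z)
        # Between-cluster variation of z within the stratum.
        variance += m / (m - 1) * (sum(v * v for v in z) - sum(z) ** 2 / m)
    return sqrt(variance) / x_total


def jackknife_se(clusters):
    """Delete-one-cluster jackknife standard error of y/x."""
    k = len(clusters)
    r = ratio(clusters)
    pseudo = []
    for i in range(k):
        r_without_i = ratio(clusters[:i] + clusters[i + 1:])
        pseudo.append(k * r - (k - 1) * r_without_i)   # pseudo-value
    return sqrt(sum((p - r) ** 2 for p in pseudo) / (k * (k - 1)))


# Hypothetical weighted totals for six clusters in two strata.
clusters = [
    ("urban", 12.0, 40.0), ("urban", 18.0, 50.0), ("urban", 9.0, 30.0),
    ("rural", 20.0, 45.0), ("rural", 14.0, 35.0), ("rural", 25.0, 60.0),
]
r = ratio(clusters)
se_taylor = taylor_se(clusters)
se_jackknife = jackknife_se(clusters)
```

The linearised estimator needs an explicit formula for the statistic, whereas the jackknife only needs the statistic to be recomputable on each replicate, which is why it suits rates with no simple closed form.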
A more detailed description of the estimates of sampling errors is presented in Appendix B of the survey final report.
Other forms of data appraisal
Data Quality Tables
- Household age distribution
- Age distribution of eligible and interviewed women
- Age distribution of eligible and interviewed men
- Completeness of reporting
- Births by calendar years
- Reporting of age at death in days
- Reporting of age at death in months
- Height and weight data completeness and quality for children
- Completeness of information on siblings
- Sibship size and sex ratio of siblings
See details of the data quality tables in Appendix C of the survey final report.
The DHS Program
Request Dataset Access
The following applies to DHS, MIS, AIS and SPA survey datasets (Surveys, GPS, and HIV).
To request dataset access, you must first be a registered user of the website. You must then create a new research project request. The request must include a project title and a description of the analysis you propose to perform with the data.
The requested data should only be used for the purpose of the research or study. To request the same or different data for another purpose, a new research project request should be submitted. The DHS Program will normally review all data requests within 24 hours (Monday - Friday) and provide notification if access has been granted or additional project information is needed before access can be granted.
DATASET ACCESS APPROVAL PROCESS
Access to DHS, MIS, AIS and SPA survey datasets (Surveys, HIV, and GPS) is requested and granted by country. This means that when approved, full access is granted to all unrestricted survey datasets for that country. Access to HIV and GIS datasets requires an online acknowledgment of the conditions of use.
A dataset request must include contact information, a research project title, and a description of the analysis you propose to perform with the data.
A few datasets are restricted, and these are noted. Access to restricted datasets is requested online as with other datasets. An additional consent form is required for some datasets, and the form will be emailed to you upon authorization of your account. For other restricted surveys, permission must be granted by the appropriate implementing organizations before The DHS Program can grant access. You will be emailed the information for contacting the implementing organizations. A few restricted surveys are authorized directly within The DHS Program, upon receipt of an email request.
When The DHS Program receives authorization from the appropriate organizations, the user will be contacted, and the datasets will be made available by secure FTP.
GPS/HIV Datasets/Other Biomarkers
Once downloaded, the datasets must not be passed on to other researchers without the written consent of The DHS Program. All reports and publications based on the requested data must be sent to The DHS Program Data Archive in a Portable Document Format (pdf) or a printed hard copy.
Datasets are made available for download by survey. You will be presented with a list of surveys for which you have been granted dataset access. After selecting a survey, a list of all available datasets for that survey will be displayed, including all survey, GPS, and HIV data files. However, only data types for which you have been granted access will be accessible. To download, simply click on the files that you wish to download and a "File Download" prompt will guide you through the remaining steps.
Use of the dataset must be acknowledged with a citation that includes:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.