The 2017-18 Pakistan Demographic and Health Survey (PDHS) was the fourth of its kind in Pakistan, following the 1990-91, 2006-07, and 2012-13 PDHS surveys.
The primary objective of the 2017-18 PDHS is to provide up-to-date estimates of basic demographic and health indicators. The PDHS provides a comprehensive overview of population, maternal, and child health issues in Pakistan. Specifically, the 2017-18 PDHS collected information on:
- Key demographic indicators, particularly fertility and under-5 mortality rates, at the national level, for urban and rural areas, and within the country’s eight regions
- Direct and indirect factors that determine levels and trends of fertility and child mortality
- Contraceptive knowledge and practice
- Maternal health and care including antenatal, perinatal, and postnatal care
- Child feeding practices, including breastfeeding, and anthropometric measures to assess the nutritional status of children under age 5 and women age 15-49
- Key aspects of family health, including vaccination coverage and prevalence of diseases among infants and children under age 5
- Knowledge and attitudes of women and men about sexually transmitted infections (STIs), including HIV/AIDS, and potential exposure to risk
- Women's empowerment and its relationship to reproductive health and family planning
- Disability level
- Extent of gender-based violence
- Migration patterns
The information collected through the 2017-18 PDHS is intended to assist policymakers and program managers at the federal and provincial government levels, in the private sector, and at international organisations in evaluating and designing programs and strategies for improving the health of the country’s population. The data also provides information on indicators relevant to the Sustainable Development Goals.
Kind of data
Sample survey data [ssd]
The data dictionary was generated from hierarchical data downloaded from The DHS Program website (http://dhsprogram.com).
Unit of analysis
- Children age 0-5
- Women age 15-49
- Men age 15-49
The survey covered all de jure household members (usual residents), children age 0-5 years, women age 15-49 years and men age 15-49 years resident in the household.
Producers and sponsors
National Institute of Population Studies (NIPS)
Government of Pakistan
The DHS Program
Provided technical assistance through The DHS Program
Pakistan Bureau of Statistics
The DHS Program
Provided technical support
Government of Pakistan
United States Agency for International Development
Department for International Development
United Nations Population Fund
The sampling frame used for the 2017-18 PDHS is a complete list of enumeration blocks (EBs) created for the Pakistan Population and Housing Census 2017, which was conducted from March to May 2017. The Pakistan Bureau of Statistics (PBS) supported the sample design of the survey and worked in close coordination with NIPS. The 2017-18 PDHS represents the population of Pakistan including Azad Jammu and Kashmir (AJK) and the former Federally Administered Tribal Areas (FATA), which were not included in the 2012-13 PDHS. The results of the 2017-18 PDHS are representative at the national level and for the urban and rural areas separately. The survey estimates are also representative for the four provinces of Punjab, Sindh, Khyber Pakhtunkhwa, and Balochistan; for the two regions of AJK and Gilgit Baltistan (GB); for Islamabad Capital Territory (ICT); and for FATA. In total, there are 13 second-level survey domains.
The 2017-18 PDHS followed a stratified two-stage sample design. The stratification was achieved by separating each of the eight regions into urban and rural areas. In total, 16 sampling strata were created. Samples were selected independently in every stratum through a two-stage selection process. Implicit stratification and proportional allocation were achieved at each of the lower administrative levels by sorting the sampling frame within each sampling stratum before sample selection, according to administrative units at different levels, and by using a probability-proportional-to-size selection at the first stage of sampling.
The first stage involved selecting sample points (clusters) consisting of EBs. EBs were drawn with a probability proportional to their size, which is the number of households residing in the EB at the time of the census. A total of 580 clusters were selected.
The second stage involved systematic sampling of households. A household listing operation was undertaken in all of the selected clusters, and a fixed number of 28 households per cluster was selected with an equal probability systematic selection process, for a total sample size of approximately 16,240 households. The household selection was carried out centrally at the NIPS data processing office. The survey teams only interviewed the pre-selected households. To prevent bias, no replacements and no changes to the pre-selected households were allowed at the implementing stages.
For further details on sample design, see Appendix A of the final report.
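The two selection stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure PBS or NIPS actually ran: the function names, the made-up sampling frame, and the single-stratum setup are all hypothetical, and the real design selected clusters independently within each of the 16 strata.

```python
import random


def pps_systematic(sizes, n, rng):
    """Stage 1: pick n units with probability proportional to size,
    using systematic selection along the cumulated size measure."""
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    cumulative = 0.0
    k = 0  # index of the next selection point
    for i, size in enumerate(sizes):
        cumulative += size
        while k < n and start + k * interval < cumulative:
            selected.append(i)
            k += 1
    # Guard against floating-point rounding on the final selection point.
    while len(selected) < n:
        selected.append(len(sizes) - 1)
    return selected


def equal_probability_systematic(n_listed, n_take, rng):
    """Stage 2: pick n_take households from a listing of n_listed
    households with an equal-probability systematic selection."""
    if n_listed <= n_take:
        return list(range(n_listed))
    interval = n_listed / n_take
    start = rng.random() * interval
    return [min(int(start + k * interval), n_listed - 1) for k in range(n_take)]


rng = random.Random(2017)
# Hypothetical frame: household counts for 5,000 enumeration blocks.
frame = [rng.randint(150, 350) for _ in range(5000)]
clusters = pps_systematic(frame, 20, rng)
households = equal_probability_systematic(frame[clusters[0]], 28, rng)

# Arithmetic stated in the text: 8 regions x urban/rural, 580 clusters x 28.
n_strata = 8 * 2
planned_households = 580 * 28
print(n_strata, planned_households)  # 16 16240
```

Sorting the frame by administrative unit before calling `pps_systematic` is what would give the implicit stratification mentioned in the text; the sketch leaves the frame unsorted for brevity.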
A total of 15,671 households were selected for the survey, of which 15,051 were occupied. The response rates are presented separately for Pakistan, Azad Jammu and Kashmir, and Gilgit Baltistan. Of the 12,338 occupied households in Pakistan, 11,869 households were successfully interviewed, yielding a response rate of 96%. Similarly, the household response rates were 98% in Azad Jammu and Kashmir and 99% in Gilgit Baltistan.
In the interviewed households, 94% of ever-married women age 15-49 in Pakistan, 97% in Azad Jammu and Kashmir, and 94% in Gilgit Baltistan were interviewed. In the subsample of households selected for the male survey, 87% of ever-married men age 15-49 in Pakistan, 94% in Azad Jammu and Kashmir, and 84% in Gilgit Baltistan were successfully interviewed.
Overall, the response rates were lower in urban than in rural areas. The difference is slightly less pronounced for Azad Jammu and Kashmir and Gilgit Baltistan. The response rates for men are lower than those for women, as men are often away from their households for work.
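As a quick check of the household figure quoted above, the response rate is the number of interviewed households divided by the number of occupied households. The helper name below is hypothetical; the counts are the ones given in the text for Pakistan.

```python
def response_rate(interviewed, eligible):
    """Return the response rate as a percentage of eligible units."""
    return 100.0 * interviewed / eligible


# Pakistan: 11,869 households interviewed out of 12,338 occupied households.
pakistan_households = response_rate(11_869, 12_338)
print(f"{pakistan_households:.1f}%")  # 96.2%
```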
Due to non-proportional sample allocation, the sample was not self-weighting. Weighting factors have been calculated, added to the data file, and applied so that results are representative at the national level for Pakistan (including FATA and ICT Islamabad) and separately for Azad Jammu and Kashmir and Gilgit Baltistan.
A spreadsheet containing all sampling parameters and selection probabilities was prepared to facilitate the calculation of the design weights. Design weights were adjusted for cluster-level non-response, household-level non-response, and individual non-response to obtain the sampling weights for the women’s and men’s surveys, respectively. The differences between the household sampling weights and the individual sampling weights are introduced by individual non-response. The final sampling weights were normalised so that the total number of unweighted cases equals the total number of weighted cases at the national level, for both household weights and individual weights. Five sets of weights were calculated:
- one set for all households selected for the survey
- one set for women selected for individual survey
- one set for households selected for the male survey
- one set for the male individual survey
- one set for the domestic violence survey
It is important to note that the normalised weights are relative weights, which are valid for estimating means, proportions, and ratios but not for estimating population totals or for pooled data. Also, the number of weighted cases obtained with the normalised weights has no direct relation to survey precision because the weights are relative. In oversampled areas especially, the number of weighted cases is much smaller than the number of unweighted cases; it is the latter that is directly related to survey precision.
For further details on sampling weights, see Appendix A.4 of the final report.
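The effect of normalisation can be seen in a small sketch. The weights and values below are invented for illustration and the function names are hypothetical; the point is only that rescaling the weights so they sum to the number of cases leaves means and proportions unchanged, while the weighted "total" no longer estimates a population count.

```python
def normalise(weights):
    """Rescale weights so that they sum to the number of cases."""
    n = len(weights)
    total = sum(weights)
    return [w * n / total for w in weights]


def weighted_mean(values, weights):
    """Weighted mean (or proportion, for 0/1 values)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Invented example: the first two cases come from an oversampled area,
# so each represents fewer households than the other three cases.
design_weights = [500.0, 500.0, 2000.0, 2000.0, 2000.0]
outcome = [1, 0, 1, 1, 0]

relative_weights = normalise(design_weights)

mean_design = weighted_mean(outcome, design_weights)
mean_relative = weighted_mean(outcome, relative_weights)

# Weighted number of cases in the oversampled area versus the 2 actual cases.
weighted_cases_oversampled = sum(relative_weights[:2])

print(mean_design, mean_relative)
print(sum(relative_weights), weighted_cases_oversampled)
```

In this invented example the two oversampled cases count as fewer than one weighted case, which is why the unweighted count, not the weighted one, should be used when judging precision.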
Dates of collection
Mode of data collection
Six questionnaires were used in the 2017-18 PDHS: Household Questionnaire, Woman’s Questionnaire, Man’s Questionnaire, Biomarker Questionnaire, Fieldworker Questionnaire, and the Community Questionnaire. The first five questionnaires, based on The DHS Program’s standard Demographic and Health Survey (DHS-7) questionnaires, were adapted to reflect the population and health issues relevant to Pakistan. The Community Questionnaire was based on the instrument used in the previous rounds of the Pakistan DHS. Comments were solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international donors. The survey protocol was reviewed and approved by the National Bioethics Committee, Pakistan Health Research Council, and ICF Institutional Review Board. After the questionnaires were finalised in English, they were translated into Urdu and Sindhi. The 2017-18 PDHS used paper-based questionnaires for data collection, while computer-assisted field editing (CAFE) was used to edit the questionnaires in the field.
The processing of the 2017-18 PDHS data began simultaneously with the fieldwork. As soon as data collection was completed in each cluster, all electronic data files were transferred via IFSS to the NIPS central office in Islamabad. These data files were registered and checked for inconsistencies, incompleteness, and outliers. The field teams were alerted to any inconsistencies and errors. Secondary editing was carried out in the central office, which involved resolving inconsistencies and coding the open-ended questions. The NIPS data processing manager coordinated the exercise at the central office. The PDHS core team members assisted with the secondary editing. Data entry and editing were carried out using the CSPro software package. The concurrent processing of the data offered a distinct advantage as it maximised the likelihood of the data being error-free and accurate. The secondary editing of the data was completed in the first week of May 2018. The final cleaning of the data set was carried out by The DHS Program data processing specialist and completed on 25 May 2018.
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2017-18 Pakistan Demographic and Health Survey (2017-18 PDHS) to minimise this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2017-18 PDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
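The plus-or-minus-two-standard-errors rule above translates directly into a small helper. The function name and the example estimate and standard error are hypothetical; they are not taken from the survey tables.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Approximate 95% interval: estimate +/- multiplier * standard error."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical proportion of 0.340 with a standard error of 0.012.
lower, upper = confidence_interval(0.340, 0.012)
print(f"{lower:.3f} to {upper:.3f}")  # 0.316 to 0.364
```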
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2017-18 PDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed by SAS programmes developed by ICF. These programmes use the Taylor linearisation method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
The Taylor linearisation method treats any percentage or average as a ratio estimate, r = y/x, where y represents the total sample value for variable y, and x represents the total number of cases in the group or subgroup under consideration.
A more detailed description of the estimates of sampling errors is presented in Appendix B of the survey final report.
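For readers who want to see what a linearised variance for the ratio r = y/x looks like in practice, the sketch below uses the textbook stratified-cluster estimator: within each stratum, the cluster-level residuals z = y - r·x are formed and their between-cluster variation is summed. This is an assumption about the general technique, not a copy of the ICF SAS programmes, and it ignores the finite population correction. The input layout, function name, and cluster totals are invented for illustration.

```python
from collections import defaultdict
from math import sqrt


def ratio_and_standard_error(cluster_totals):
    """Estimate r = y/x and its linearised standard error.

    cluster_totals: list of (stratum, y_total, x_total), one entry per
    cluster, where y_total is the weighted sum of the variable and
    x_total is the weighted number of cases in that cluster.
    """
    y = sum(y_c for _, y_c, _ in cluster_totals)
    x = sum(x_c for _, _, x_c in cluster_totals)
    r = y / x

    # Group the linearised residuals z = y_c - r * x_c by stratum.
    residuals = defaultdict(list)
    for stratum, y_c, x_c in cluster_totals:
        residuals[stratum].append(y_c - r * x_c)

    variance = 0.0
    for z in residuals.values():
        m = len(z)
        if m < 2:
            raise ValueError("each stratum needs at least two clusters")
        z_sum = sum(z)
        variance += m / (m - 1) * (sum(v * v for v in z) - z_sum ** 2 / m)
    variance /= x ** 2
    return r, sqrt(variance)


# Invented cluster totals for two strata.
example = [
    ("urban", 12.0, 30.0), ("urban", 9.0, 28.0), ("urban", 15.0, 31.0),
    ("rural", 20.0, 40.0), ("rural", 18.0, 42.0), ("rural", 25.0, 41.0),
]
r, se = ratio_and_standard_error(example)

# If every cluster has the same ratio, the residuals and the SE are zero.
flat = [("a", 5.0, 10.0), ("a", 10.0, 20.0), ("b", 3.0, 6.0), ("b", 4.0, 8.0)]
r_flat, se_flat = ratio_and_standard_error(flat)
print(r, se, r_flat, se_flat)
```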
Other forms of data appraisal
Data Quality Tables
- Household age distribution
- Household age distribution – Azad Jammu and Kashmir
- Household age distribution – Gilgit Baltistan
- Age distribution of eligible and interviewed women
- Age distribution of eligible and interviewed women – Azad Jammu and Kashmir
- Age distribution of eligible and interviewed women – Gilgit Baltistan
- Age distribution of eligible and interviewed men
- Age distribution of eligible and interviewed men – Azad Jammu and Kashmir
- Age distribution of eligible and interviewed men – Gilgit Baltistan
- Completeness of reporting
- Completeness of reporting – Azad Jammu and Kashmir
- Completeness of reporting – Gilgit Baltistan
- Births by calendar years
- Births by calendar years – Azad Jammu and Kashmir
- Births by calendar years – Gilgit Baltistan
- Reporting of age at death in days
- Reporting of age at death in days – Azad Jammu and Kashmir
- Reporting of age at death in days – Gilgit Baltistan
- Reporting of age at death in months
- Reporting of age at death in months – Azad Jammu and Kashmir
- Reporting of age at death in months – Gilgit Baltistan
- Height and weight data completeness and quality for children
- Height and weight data completeness and quality for children – Azad Jammu and Kashmir
- Height and weight data completeness and quality for children – Gilgit Baltistan
See details of the data quality tables in Appendix C of the survey final report.
The DHS Program
Request Dataset Access
The following applies to DHS, MIS, AIS and SPA survey datasets (Surveys, GPS, and HIV).
To request dataset access, you must first be a registered user of the website. You must then create a new research project request. The request must include a project title and a description of the analysis you propose to perform with the data.
The requested data should only be used for the purpose of the research or study. To request the same or different data for another purpose, a new research project request should be submitted. The DHS Program will normally review all data requests within 24 hours (Monday - Friday) and provide notification if access has been granted or additional project information is needed before access can be granted.
DATASET ACCESS APPROVAL PROCESS
Access to DHS, MIS, AIS and SPA survey datasets (Surveys, HIV, and GPS) is requested and granted by country. This means that when approved, full access is granted to all unrestricted survey datasets for that country. Access to HIV and GIS datasets requires an online acknowledgment of the conditions of use.
A dataset request must include contact information, a research project title, and a description of the analysis you propose to perform with the data.
A few datasets are restricted and these are noted. Access to restricted datasets is requested online as with other datasets. An additional consent form is required for some datasets, and the form will be emailed to you upon authorization of your account. For other restricted surveys, permission must be granted by the appropriate implementing organizations, before The DHS Program can grant access. You will be emailed the information for contacting the implementing organizations. A few restricted surveys are authorized directly within The DHS Program, upon receipt of an email request.
When The DHS Program receives authorization from the appropriate organizations, the user will be contacted, and the datasets made available by secure FTP.
GPS/HIV Datasets/Other Biomarkers
Once downloaded, the datasets must not be passed on to other researchers without the written consent of The DHS Program. All reports and publications based on the requested data must be sent to The DHS Program Data Archive in a Portable Document Format (pdf) or a printed hard copy.
Datasets are made available for download by survey. You will be presented with a list of surveys for which you have been granted dataset access. After selecting a survey, a list of all available datasets for that survey will be displayed, including all survey, GPS, and HIV data files. However, only data types for which you have been granted access will be accessible. To download, simply click on the files that you wish to download and a "File Download" prompt will guide you through the remaining steps.
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
Information about The DHS Program
The DHS Program
The DHS Program
Data and Data Related Resources
The DHS Program
Development Economics Data Group
The World Bank
Documentation of the DDI
Version 01 (February 2019). Metadata is excerpted from "Pakistan Demographic and Health Survey 2017-18" Report.