The Multiple Indicator Cluster Survey, Round 3 (MICS3) is the third round of MICS surveys; the earlier rounds were conducted around 1995 (MICS1) and 2000 (MICS2). Many questions and indicators are consistent and compatible with the prior round (MICS2), and less so with MICS1, although a number of indicator definitions have changed between rounds. Details can be found by reviewing the indicator definitions.
The Multiple Indicator Cluster Survey (MICS) is a household survey programme developed by UNICEF to assist countries in filling data gaps for monitoring human development in general and the situation of children and women in particular. MICS is capable of producing statistically sound, internationally comparable estimates of social indicators. The current round of MICS is focused on providing a monitoring tool for the Millennium Development Goals (MDGs), the World Fit for Children (WFFC), as well as for other major international commitments, such as the United Nations General Assembly Special Session (UNGASS) on HIV/AIDS and declarations issued by the League of Arab States and related institutions and organizations concerned about child rights in Arab countries, and the Cairo Declaration “Towards an Arab World Fit for Children”, and the Second Arab Work Plan for Children (2004-2015) that was adopted at the Arab Summits.
The 2006 Yemen Multiple Indicator Cluster Survey has as its primary objectives:
- To provide up-to-date information for assessing the situation of children and women in Yemen;
- To furnish data needed for monitoring progress toward goals established in the Millennium Declaration, the goals of A World Fit For Children (WFFC), and other internationally agreed upon goals, as a basis for future action;
- To contribute to the improvement of data and monitoring systems in Yemen and to strengthen technical expertise in the design, implementation, and analysis of such systems.
MICS questionnaires are designed in a modular fashion that can be easily customized to the needs of a country. They consist of a household questionnaire, a questionnaire for women aged 15-49 and a questionnaire for children under the age of five (to be administered to the mother or caretaker). Other than a set of core modules, countries can select which modules they want to include in each questionnaire.
The survey was implemented by the Ministry of Health and Population, with the support and assistance of UNICEF and PAPFAM. Technical assistance and training for the surveys were provided through a series of regional workshops covering questionnaire content, sampling and survey implementation, data processing, data quality and data analysis, and report writing and dissemination.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Households (defined as a group of persons who usually live and eat together)
De jure household members (defined as members of the household who usually live in the household, which may include people who did not sleep in the household the previous night, but does not include visitors who slept in the household the previous night but do not usually live in the household)
Ever-married women aged 15-49
Children aged 0-4
Version 1.0: Edited data used for final report
The Yemen Multiple Indicator Cluster Survey included the following modules in the questionnaires:
HOUSEHOLD QUESTIONNAIRE: Household listing, Education, Water and Sanitation, Housing characteristics, Child labor, Child discipline, and Disability.
WOMEN'S QUESTIONNAIRE: Women's characteristics, Marriage, Child mortality, Birth history, Tetanus Toxoid, Maternal and newborn health, Contraception and unmet need, and HIV and AIDS.
CHILDREN'S QUESTIONNAIRE: Children's characteristics, Birth registration and early education, Child development, Care for illness, and Immunization.
Water and sanitation
Maternal and newborn health
Contraception and unmet need
Care of illness
The survey is nationally representative and covers the whole of Yemen, excluding islands and the nomadic population.
The survey covered all de jure household members (usual residents), ever-married women aged 15-49 years resident in the household, and all children aged 0-4 years (under age 5) resident in the household.
Producers and sponsors
Ministry of Health
Pan Arab Project for Family Health
League of Arab States
Strategic Information Section, Division of Policy and Planning, UNICEF NYHQ
International technical assistance
UNICEF, Yemen Country Office
Funding of survey implementation and analyses
The Yemen MICS sample design was a two-stage stratified cluster sample. The following parameters were accounted for in designing the sample:
1 - The sample is to provide estimates with reasonable precision at national and urban/rural levels.
2 - The residents of the Yemeni islands and the nomadic population are excluded from survey coverage.
3 - The size of the ultimate cluster is 20 households.
4 - The design is approximately self-weighting.
The sample is allocated proportionally between the urban and rural strata; the percentage of households to be allocated to each stratum was obtained from the 2004 Census. With an ultimate cluster size of 20 households, the number of sample clusters is 200 (200 x 20 = 4,000 households by design). Proportional allocation gives 142 clusters to the rural stratum and 58 clusters to the urban stratum.
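The allocation arithmetic above can be sketched as follows. This is an illustration only: the urban and rural shares of 29 and 71 percent are back-calculated from the reported 58/142 split rather than taken from the 2004 Census tables, and the function name and the largest-remainder rounding rule are assumptions.

```python
def allocate_clusters(total_clusters, weights):
    """Allocate clusters to strata in proportion to integer weights.

    Uses integer arithmetic and gives any leftover clusters to the strata
    with the largest remainders (an assumed rounding rule).
    """
    total_weight = sum(weights.values())
    quotas = {s: divmod(total_clusters * w, total_weight)
              for s, w in weights.items()}
    allocation = {s: quotient for s, (quotient, _) in quotas.items()}
    leftover = total_clusters - sum(allocation.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s][1], reverse=True)
    for stratum in by_remainder[:leftover]:
        allocation[stratum] += 1
    return allocation


CLUSTER_SIZE = 20       # households per ultimate cluster (from the design)
N_CLUSTERS = 200        # number of sample clusters (from the design)

# Assumed household shares in percent, implied by the 58/142 allocation.
shares = {"urban": 29, "rural": 71}

allocation = allocate_clusters(N_CLUSTERS, shares)
total_households = N_CLUSTERS * CLUSTER_SIZE
print(allocation, total_households)
```

With other census shares the same function would return a different split; only the cluster size and the number of clusters are taken from the design.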
The sample is to be selected in two stages. The Primary Sampling Unit (PSU) is a village (or a group of villages) in rural areas and a lane (hara) in urban. The micro data of the 2004 Census at these administrative levels has been relied upon to create frames for the first stage sample. The following provides a description of the sample selection in both stages:
First Stage Sample
The 2004 Census data (numbers of households and population) for all urban and rural agglomerations have been utilized to create appropriate frames for the first stage sample of urban and rural strata. It was taken into account that the PSU size would be in the range 150-300 households approximately. The creation of a rural frame has entailed grouping neighboring small villages so as to create PSUs in the range of 150-300 households each. Hence, a rural PSU is in most cases a group of small villages. The whole village is considered a PSU as long as its size is in the range 150-300 households.
The situation in urban areas is quite different, since most lanes (haras) are much larger than the desired PSU size. For this reason, a second (dummy) sampling stage is necessary to reduce the burden of field listing whenever the lane size is above 300 households. In the first-stage urban sample, 41 PSUs required division into equally sized parts, whereas only 4 PSUs in the rural sample needed to be divided.
Implicit stratification was introduced in both the rural and urban PSU frames. Governorates were ordered geographically in a serpentine fashion, starting from the northwest corner, moving to the northeast corner and back to the west, then to the east, and so on until the last governorate. Moreover, as governorates are further divided into a number of directorates (modyriate), implicit stratification was also applied within each governorate by ordering directorates geographically in the same way. Implicit stratification is expected to yield more precise sample estimates at both national and urban/rural levels.
The selection of rural and urban first stage samples was made following the Probability Proportionate to Size (PPS) selection method. The employed measure of size (MOS) is the number of Households in each PSU as measured in the 2004 Census.
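A minimal sketch of systematic PPS selection from an implicitly stratified frame is given below. It assumes the frame is already sorted in the serpentine geographic order described above; the PSU identifiers, household counts, sample size and random seed are hypothetical, and the selection program actually used for the survey is not documented here.

```python
import random


def pps_systematic(frame, n_sample, rng):
    """Select n_sample PSUs with probability proportional to size.

    frame: list of (psu_id, households) pairs, already in serpentine order.
    Returns the selected PSU ids and the sampling interval.
    """
    total = sum(size for _, size in frame)
    interval = total / n_sample
    start = rng.random() * interval          # random start in [0, interval)
    targets = [start + k * interval for k in range(n_sample)]

    selected = []
    cumulative = 0
    t = 0
    for psu_id, size in frame:
        cumulative += size
        # A PSU is selected whenever a target falls inside its size range.
        while t < n_sample and targets[t] < cumulative:
            selected.append(psu_id)
            t += 1
    return selected, interval


# Hypothetical frame: 12 PSUs of 150-300 households each (measure of size).
sizes = [150, 220, 300, 180, 260, 210, 170, 290, 240, 200, 155, 275]
frame = [("PSU%02d" % i, size) for i, size in enumerate(sizes, start=1)]

selected, interval = pps_systematic(frame, n_sample=4, rng=random.Random(2006))

# Under PPS the selection probability of a PSU is n_sample * size / total.
total_mos = sum(sizes)
probabilities = {psu: 4 * size / total_mos for psu, size in frame}
print(selected, interval)
```

Because the frame is walked in its sorted order, the selected PSUs are spread across the geographic ordering, which is how implicit stratification takes effect.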
Second stage sample
Each PSU selected in the first stage, whether a whole PSU or a part of one, was updated in the field: a listing operation was carried out to create an up-to-date list of households for each sample PSU (or part). These lists were used as sampling frames for selecting the second stage sample.
The selection method was designed to create compact ultimate clusters of 20 households in the rural sample and non-compact ultimate clusters of the same size in the urban sample. Compact clusters were chosen for the rural sample because most rural sample PSUs are composed of several small villages which are, in most cases, located at the tops of adjacent mountains. Systematic selection would spread the household sample over several small villages within the same PSU and make the main survey fieldwork much more difficult. It was therefore deemed operationally efficient to treat the household list for each rural sample PSU as a circle. A single random number between 1 and the total number of households in the list determines the entire household sample for the PSU: the household indicated by the random number and the subsequent 19 households in the list (wrapping around to the start of the list if its end is reached) constitute the sample.
In the case of the urban sample, however, an ordinary random systematic selection is suggested, so as to produce a non-compact cluster of 20 households. The households forming an urban PSU (or a part of it) are not dispersed over a large area; hence a compact cluster is not justified.
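The two second-stage procedures can be sketched as follows. This is an illustration under assumptions: the list lengths and seeds are hypothetical, households are identified only by their position (1 to N) in the updated list, and the urban routine uses a fractional sampling interval, which is one common way to carry out systematic selection.

```python
import random

CLUSTER_SIZE = 20


def compact_circular_cluster(n_listed, rng, cluster_size=CLUSTER_SIZE):
    """Rural rule: one random start, then the next 19 households,
    treating the household list as a circle."""
    start = rng.randint(1, n_listed)             # inclusive on both ends
    return [((start - 1 + k) % n_listed) + 1 for k in range(cluster_size)]


def systematic_cluster(n_listed, rng, cluster_size=CLUSTER_SIZE):
    """Urban rule: random start and fixed interval across the whole list."""
    interval = n_listed / cluster_size
    start = rng.random() * interval              # in [0, interval)
    return [int(start + k * interval) + 1 for k in range(cluster_size)]


# Hypothetical updated list sizes for one rural and one urban PSU.
rural_sample = compact_circular_cluster(187, random.Random(1))
urban_sample = systematic_cluster(243, random.Random(2))
print(rural_sample)
print(urban_sample)
```

The rural sample is a run of consecutive list positions, while the urban sample is spread evenly over the list.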
Deviations from the Sample Design
No major deviations from the original sample design were made. All sample enumeration areas were accessed and successfully interviewed with good response rates.
Of the 3979 households selected for the sample, 3972 were found to be occupied. Of these, 3586 were successfully interviewed, for a household response rate of 90.3 percent. In the interviewed households, 3912 ever-married women (age 15-49) were identified. Of these, 3742 were successfully interviewed, yielding a response rate of 95.7 percent. In addition, 3918 children under age five were listed in the household questionnaire. Questionnaires were completed for 3783 of these children, which corresponds to a response rate of 96.6 percent. Overall response rates (the household response rate multiplied by the women's or children's response rate) of 86.4 and 87.2 percent are calculated for the women’s and under-5’s interviews respectively. Response rates were similar across urban and rural areas.
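The response rates can be reproduced from the counts given above. The sketch below uses only those counts; the variable names are illustrative.

```python
# Counts reported for the 2006 Yemen MICS.
households_occupied = 3972
households_interviewed = 3586
women_eligible = 3912
women_interviewed = 3742
children_eligible = 3918
children_completed = 3783

household_rate = households_interviewed / households_occupied
women_rate = women_interviewed / women_eligible
children_rate = children_completed / children_eligible

# Overall rates combine household-level and individual-level response.
overall_women_rate = household_rate * women_rate
overall_children_rate = household_rate * children_rate

for label, rate in [
    ("Household", household_rate),
    ("Women", women_rate),
    ("Under-5", children_rate),
    ("Women, overall", overall_women_rate),
    ("Under-5, overall", overall_children_rate),
]:
    print("%-17s %.1f%%" % (label, 100 * rate))
```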
Sample weights were calculated for each of the datafiles.
Weights were used in deriving survey estimates to account for the expected differences between the updated household lists of the sample PSUs and the measure of size (the 2004 Census number of households), as well as for non-response, which is inevitable in surveys of this nature. If non-response varies substantially across sample PSUs, weights are needed to adjust the data. The final weight is the product of the design weight and the non-response weight, where the design weight is the inverse of the overall selection probability and the non-response weight is the inverse of the response rate.
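The weight calculation described above can be sketched as follows. The stratum total of 1,000,000 households, the PSU sizes and the listing counts are hypothetical, and the level at which the response rate is computed (here a single household response rate) is an assumption; only the structure (PPS first stage, optional segmentation, 20 households from the updated list) follows the sample design.

```python
import math


def design_weight(psus_selected, psu_mos, stratum_mos, listed_households,
                  segments=1, cluster_size=20):
    """Inverse of the overall selection probability of a household."""
    p_first = psus_selected * psu_mos / stratum_mos   # PPS first stage
    p_segment = 1 / segments                          # one part kept, if split
    p_second = cluster_size / listed_households       # 20 from updated list
    return 1 / (p_first * p_segment * p_second)


def final_weight(design_w, response_rate):
    """Design weight times the non-response weight (1 / response rate)."""
    return design_w * (1 / response_rate)


STRATUM_MOS = 1_000_000   # hypothetical census households in the stratum

# Listing matches the census count: the weight equals the stratum constant.
w_unchanged = design_weight(142, 200, STRATUM_MOS, listed_households=200)
# The PSU has grown by 30 percent since the census: the weight rises by 30 percent.
w_grown = design_weight(142, 200, STRATUM_MOS, listed_households=260)

w_final = final_weight(w_unchanged, response_rate=0.903)
print(w_unchanged, w_grown, w_final)
```

When the updated list agrees with the measure of size, the PSU size cancels out and every household in the stratum has the same design weight, which is why the design is described as approximately self-weighting.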
Dates of Data Collection
Data Collection Mode
Interviewing was conducted by teams of interviewers. Each interviewing team comprised 3-4 female interviewers, a field editor, a supervisor, and a driver. Each team used a four-wheel-drive vehicle to travel from cluster to cluster (and, where necessary, within clusters).
The role of the supervisor was to coordinate field data collection activities, including management of the field teams, supplies and equipment, finances, and maps and listings, to coordinate with local authorities concerning the survey plan, and to make arrangements for accommodation and travel. Additionally, the field supervisor assigned work to the interviewers, spot-checked work, maintained field control documents, and sent completed questionnaires and progress reports to the central office.
The field editor was responsible for reviewing each questionnaire at the end of the day, checking for missed questions, skip errors, fields incorrectly completed, and checking for inconsistencies in the data. The field editor also observed interviews and conducted review sessions with interviewers.
Responsibilities of the supervisors and field editors are described in the Instructions for Supervisors and Field Editors, together with the different field controls that were in place to control the quality of the fieldwork.
Field visits were also made by a team of central staff on a periodic basis during fieldwork. The senior staff of Ministry of Public Health & Population also made 3 visits to field teams to provide support and to review progress.
Data Collection Notes
Training for the fieldwork was conducted for 2 weeks in August 2006. Training included lectures on interviewing techniques and the contents of the questionnaires, and mock interviews between trainees to gain practice in asking questions.
The data were collected by 16 teams; each team was comprised of 4 female interviewers, one driver, one male editor and a male supervisor. Fieldwork took place over one month in September 2006.
Interviews averaged 35 minutes for the household questionnaire, 23 minutes for the women's questionnaire, and 27 minutes for the under-five children's questionnaire. Interviews were conducted in Arabic.
Ministry of Public Health & Population
The questionnaires for the Yemen MICS were structured questionnaires based on the MICS3 Model Questionnaire with some modifications and additions. A household questionnaire was administered in each household, which collected various information on household members including sex, age, relationship, and orphanhood status. The household questionnaire includes household listing, education, water and sanitation, housing characteristics, child labor, child discipline and disability.
In addition to a household questionnaire, questionnaires were administered in each household for ever-married women age 15-49 and children under age five. For children, the questionnaire was administered to the mother or caretaker of the child.
The women's questionnaire includes women's characteristics, marriage, child mortality, birth history, tetanus toxoid, maternal and newborn health, contraception and unmet need, and HIV and AIDS modules.
The children's questionnaire includes children's characteristics, birth registration and early education, child development, care for illness, and immunization.
The questionnaires were based on the Arabic version of the MICS3 model questionnaire. They were pre-tested, and based on the results of the pre-test, modifications were made to the wording and translation. A copy of the Yemen MICS questionnaires is provided in Appendix F of the final report.
Data were processed in clusters, with each cluster being processed as a complete unit through each stage of data processing. Each cluster goes through the following steps:
1) Questionnaire reception
2) Office editing and coding
3) Data entry
4) Structure and completeness checking
5) Verification entry
6) Comparison of verification data
7) Back up of raw data
8) Secondary editing
9) Edited data back up
After all clusters are processed, all data is concatenated together and then the following steps are completed for all data files:
10) Export to SPSS in 5 files (hh - household, hl - household members, wm - women, bh - birth history, ch - children under 5)
11) Recoding of variables needed for analysis
12) Adding of sample weights
13) Calculation of wealth quintiles and merging into data
14) Structural checking of SPSS files
15) Data quality tabulations
16) Production of analysis tabulations
Details of each of these steps can be found in the data processing documentation, data editing guidelines, data processing programs in CSPro and SPSS, and tabulation guidelines.
Data entry was carried out by 11 data entry operators and 1 data entry supervisor. In order to ensure quality control, all questionnaires were double entered and internal consistency checks were performed. Procedures and standard programs developed under the global MICS3 project and adapted to the Yemen questionnaire were used throughout. Data processing began in October 2006, after data collection had been conducted, and was completed in December 2006. All range checks and skips were controlled by the program and operators could not override these. A limited set of consistency checks was also included in the data entry program. Open-ended responses ("Other" answers) were not entered or coded, except in rare circumstances where the response matched an existing code in the questionnaire.
Structure and completeness checking ensured that all questionnaires for the cluster had been entered, were structurally sound, and that women's and children's questionnaires existed for each eligible woman and child.
100% verification of all variables was performed using independent verification, i.e. double entry of data, with separate comparison of data followed by modification of one or both datasets to correct keying errors by original operators who first keyed the files.
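The comparison step of independent verification can be illustrated with a short sketch. The survey itself used CSPro for this; the code below is not that tool but a generic illustration, and the record keys (cluster, household, line number), variable names and values are hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """List every difference between two independently keyed datasets.

    Each dataset maps a record key to a dict of variable -> value.
    Returns tuples of (key, variable, first value, second value).
    """
    discrepancies = []
    for key in sorted(set(first_pass) | set(second_pass)):
        a = first_pass.get(key)
        b = second_pass.get(key)
        if a is None or b is None:
            # Record keyed in only one of the two passes.
            discrepancies.append((key, "<record missing>", a, b))
            continue
        for variable in sorted(set(a) | set(b)):
            if a.get(variable) != b.get(variable):
                discrepancies.append((key, variable, a.get(variable), b.get(variable)))
    return discrepancies


# Hypothetical records keyed by (cluster, household, line number).
first_pass = {
    (12, 3, 1): {"sex": 2, "age": 34},
    (12, 3, 2): {"sex": 1, "age": 7},
}
second_pass = {
    (12, 3, 1): {"sex": 2, "age": 43},   # transposed digits in one pass
    (12, 3, 2): {"sex": 1, "age": 7},
}

discrepancies = compare_entries(first_pass, second_pass)
for item in discrepancies:
    print(item)
```

Each reported difference is then checked against the paper questionnaire, and whichever file holds the keying error is corrected.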
After completion of all processing in CSPro, all individual cluster files were backed up before concatenating data together using the CSPro file concatenate utility.
Data editing took place at a number of stages throughout the processing (see Other processing), including:
a) Office editing and coding
b) During data entry
c) Structure checking and completeness
d) Secondary editing
e) Structural checking of SPSS data files
Detailed documentation of the editing of data can be found in the data processing guidelines.
For tabulation and analysis SPSS versions 10.0 and 14.0 were used. Version 10.0 was originally used for all tabulation programs, except for child mortality. Later version 14.0 was used for child mortality, data quality tabulations and other analysis activities.
After transferring all files to SPSS, certain variables were recoded for use as background characteristics in the tabulation of the data, including grouping age, education, geographic areas as needed for analysis. In the process of recoding ages and dates some random imputation of dates (within calculated constraints) was performed to handle missing or "don't know" ages or dates. Additionally, a wealth (asset) index of household members was calculated using principal components analysis, based on household assets, and both the score and quintiles were included in the datasets for use in tabulations.
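A minimal sketch of an asset index of this kind is shown below. It is not the SPSS program used for the survey: the four asset indicators, the ten households and their sizes are hypothetical, the first principal component is found by simple power iteration on the correlation matrix, and quintiles are cut on the cumulative number of household members, which is an assumed reading of "index of household members".

```python
import math

# Hypothetical asset indicators (1 = household owns the item), 10 households.
assets = {
    "electricity": [0, 0, 1, 1, 1, 1, 1, 1, 1, 1],
    "radio":       [0, 1, 0, 1, 1, 1, 0, 1, 1, 1],
    "television":  [0, 0, 0, 0, 0, 1, 1, 1, 1, 1],
    "fridge":      [0, 0, 0, 0, 0, 0, 0, 0, 1, 1],
}
members = [6, 7, 5, 8, 6, 7, 9, 5, 6, 7]   # hypothetical household sizes


def standardize(column):
    """Centre a column on its mean and scale it by its standard deviation."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / sd for v in column]


def first_component(z_columns, iterations=200):
    """Loadings of the first principal component, by power iteration."""
    k, n = len(z_columns), len(z_columns[0])
    corr = [[sum(a * b for a, b in zip(z_columns[i], z_columns[j])) / n
             for j in range(k)] for i in range(k)]
    loadings = [1.0] * k
    for _ in range(iterations):
        product = [sum(corr[i][j] * loadings[j] for j in range(k))
                   for i in range(k)]
        norm = math.sqrt(sum(v * v for v in product))
        loadings = [v / norm for v in product]
    return loadings


z = [standardize(column) for column in assets.values()]
loadings = first_component(z)
scores = [sum(l * column[h] for l, column in zip(loadings, z))
          for h in range(len(members))]

# Quintiles of the household population: rank households by score and cut
# the cumulative number of household members into five equal groups.
order = sorted(range(len(scores)), key=lambda h: scores[h])
total_members = sum(members)
quintiles = [0] * len(scores)
cumulative = 0
for h in order:
    cumulative += members[h]
    quintiles[h] = -(-5 * cumulative // total_members)   # ceiling division

print(list(zip(quintiles, [round(s, 2) for s in scores])))
```

A household with none of the listed assets receives the lowest score and falls in the first quintile, while households owning all of them fall in the fifth.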
Estimates of Sampling Error
Estimates from a sample survey are affected by two types of errors: 1) non-sampling errors and 2) sampling errors. Non-sampling errors are the results of mistakes made in the implementation of data collection and data processing. Numerous efforts were made during implementation of the 2006 MICS to minimize this type of error; however, non-sampling errors are impossible to avoid entirely and difficult to evaluate statistically.
Sampling errors can be evaluated statistically. The sample of respondents to the 2006 MICS is only one of many possible samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability in the results of the survey between all possible samples and, although the degree of variability is not known exactly, it can be estimated from the survey results. Sampling errors are measured in terms of the standard error for a particular statistic (mean or percentage), which is the square root of the variance. Confidence intervals are calculated for each statistic, within which the true value for the population can be assumed to fall with a stated level of confidence. For key statistics presented in MICS, the interval is plus or minus two standard errors of the statistic, which is approximately a 95 percent confidence interval.
If the sample of respondents had been a simple random sample, it would have been possible to use straightforward formulae for calculating sampling errors. However, the 2006 MICS sample is the result of a multi-stage stratified design, and consequently needs to use more complex formulae. The SPSS complex samples module has been used to calculate sampling errors for the 2006 MICS. This module uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. This method is documented in the SPSS file CSDescriptives.pdf found under the Help, Algorithms options in SPSS.
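For readers without access to the SPSS documentation, the idea behind Taylor linearization for a weighted proportion can be sketched as follows. This is a textbook form of the estimator under stated assumptions (clusters treated as sampled with replacement within strata, no finite population correction), not a reproduction of the SPSS algorithm; the records and weights are hypothetical.

```python
import math
from collections import defaultdict


def linearized_proportion(records):
    """Weighted proportion and its Taylor-linearized standard error.

    records: list of (stratum, cluster, weight, y) with y coded 0 or 1.
    Clusters are treated as sampled with replacement within strata.
    """
    total_weight = sum(w for _, _, w, _ in records)
    p = sum(w * y for _, _, w, y in records) / total_weight

    # Cluster totals of the linearized variable z = w * (y - p).
    z_totals = defaultdict(lambda: defaultdict(float))
    for stratum, cluster, w, y in records:
        z_totals[stratum][cluster] += w * (y - p)

    variance = 0.0
    for clusters in z_totals.values():
        n_h = len(clusters)                    # needs at least 2 clusters
        mean_z = sum(clusters.values()) / n_h
        between = sum((z - mean_z) ** 2 for z in clusters.values())
        variance += n_h / (n_h - 1) * between
    return p, math.sqrt(variance) / total_weight


# Hypothetical records: (stratum, cluster, weight, indicator).
records = [
    ("urban", 1, 1.1, 1), ("urban", 1, 1.1, 1), ("urban", 1, 1.1, 0),
    ("urban", 2, 0.9, 1), ("urban", 2, 0.9, 0), ("urban", 2, 0.9, 0),
    ("rural", 3, 1.0, 0), ("rural", 3, 1.0, 0), ("rural", 3, 1.0, 1),
    ("rural", 4, 1.2, 0), ("rural", 4, 1.2, 1), ("rural", 4, 1.2, 1),
]
estimate, standard_error = linearized_proportion(records)
print(estimate, standard_error)
```

The variance comes entirely from differences between cluster totals within each stratum, which is why clustering and stratification, and not only the sample size, determine the standard error.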
Sampling errors have been calculated for a select set of statistics (all of which are proportions due to the limitations of the Taylor linearization method) for the national sample and for urban and rural areas. For each statistic, the following are given: the estimate; its standard error; the coefficient of variation (or relative error, the ratio between the standard error and the estimate); the design effect; the square root of the design effect (DEFT, the ratio between the standard error using the given sample design and the standard error that would result if a simple random sample had been used); and the 95 percent confidence interval (+/-2 standard errors).
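The derived measures listed above follow directly from the estimate and its standard error. The sketch below uses a hypothetical proportion of 0.40 with a design-based standard error of 0.015; the sample size of 3586 is the number of interviewed households, and the simple random sampling variance p(1 - p)/n is an assumed textbook form that may differ in detail from the SPSS computation.

```python
import math


def precision_summary(p, se_design, n):
    """Coefficient of variation, design effect, DEFT and +/-2 SE interval."""
    se_srs = math.sqrt(p * (1 - p) / n)   # SE under simple random sampling
    deft = se_design / se_srs
    return {
        "cv": se_design / p,
        "deff": deft ** 2,
        "deft": deft,
        "lower": p - 2 * se_design,
        "upper": p + 2 * se_design,
    }


# Hypothetical estimate and standard error for one indicator.
summary = precision_summary(p=0.40, se_design=0.015, n=3586)
for name, value in summary.items():
    print("%-6s %.4f" % (name, value))
```

A DEFT above 1 indicates that the clustered design gives a larger standard error than a simple random sample of the same size would.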
A series of data quality tables and graphs are available to review the quality of the data and include the following:
Age distribution of household population
Age distribution of eligible and interviewed women
Age distribution of eligible children and children for whom the mother or caretaker was interviewed
Age distribution of children under age 5 by 3 month groups
Age and period ratios at boundaries of eligibility
Percent of observations with missing information on selected variables
Presence of mother in the household and person interviewed for the under 5 questionnaire
School attendance by single year age
Sex ratio at birth among children ever born, surviving and dead by age of respondent
The results of each of these data quality tables are shown in the appendix of the final report and are also given in the external resources section.
The general rule for presentation of missing data in the final report tabulations is that a column is presented for missing data if the percentage of cases with missing data is 1% or more. Cases with missing data on the background characteristics (e.g. education) are included in the tables, but the missing data rows are suppressed and noted at the bottom of the tables in the report (not in the SPSS output, however).
Users of the data agree to keep confidential all data contained in these datasets and to make no attempt to identify, trace or contact any individual whose data is included in these datasets.
Survey datasets are distributed at no cost for legitimate research, with the condition that we receive a description of the research objectives that will be using the data prior to authorizing their distribution.
Copies of all reports and publications based on the requested data must be sent to:
1. Dr. Mutahar Al-Abbasi, Deputy Minister of Planning and International Organization, Yemen
2. Dr. Nafisa Al-Jaifi, Secretary General of High Council of Motherhood and Childhood HCMC,Yemen
Requests for access to the datasets may be made through the website www.childinfo.org.
Ministry of Public Health & Population, Yemen. Multiple Indicator Cluster Survey: Household, household listing, women and children's files, 2006 [Computer file]. Yemen: Ministry of Public Health & Population [producer], 2006. Yemen: Ministry of Public Health & Population and New York: Strategic Information Section, Division of Policy and Planning, UNICEF [distributors], 2006.
Ministry of Public Health & Population and UNICEF provide these data to external users without any warranty or responsibility implied. Ministry of Public Health & Population and UNICEF accept no responsibility for the results and/or implications of any actions resulting from the use of these data.
DDI Document ID
Producer of Yemen MICS 2006 Archive
Pan Arab Project for Family Health
Producer of Yemen MICS 2006 Archive
Data Archiving Consultant
Producer of Yemen MICS 2006 Archive
Blancroft Research International
Producer of generic template
Customization of the Yemen MICS3 Archive for childinfo.org
Date of Metadata Production
DDI Document version
Yemen MICS 2006 v0.9
Slightly edited version of UNICEF's DDI ref. DDI-MoPH&P-YEM-MICS2006/1.0-v0.1