To develop effective poverty reduction policies and programs, Iraqi policy makers need to know how large the poverty problem is, what kind of people are poor, and what the causes and consequences of poverty are. Until recently, they had neither the data nor an official poverty line. (The last national income and expenditure survey was in 1988.)
In response to this situation, the Iraqi Ministry of Planning and Development Cooperation established the Household Survey and Policies for Poverty Reduction Project in 2006, with financial and technical support from the World Bank. The project has been led by the Iraqi Poverty Reduction Strategy High Committee, a group which includes representatives from Parliament, the prime minister’s office, the Kurdistan Regional Government, and the ministries of Planning and Development Cooperation, Finance, Trade, Labor and Social Affairs, Education, Health, and Women’s Affairs, as well as Baghdad University.
The Project has consisted of three components:
- Collection of data which can provide a measurable indicator of welfare, i.e. the Iraq Household Socio-Economic Survey (IHSES).
- Establishment of an official poverty line (i.e. a cut-off point below which people are considered poor) and analysis of poverty (how large the poverty problem is, what kind of people are poor, and what the causes and consequences of poverty are).
- Development of a Poverty Reduction Strategy, based on a solid understanding of poverty in Iraq.
Kind of data
Sample survey data [ssd]
Unit of analysis
Version 02: Individual datasets have been modified so that they can more easily be compared to the 2012 datasets. In the new version, the variables were renamed. Note that the consumption aggregate was also recalculated.
Domains: Urban/rural/metropolitan; governorates
Producers and sponsors
Central Organization for Statistics and Information Technology (COSIT)
Kurdistan Regional Statistics Office (KRSO)
The World Bank
Technical and capacity building in all phases of the survey.
Government of Iraq
Funded the study
Multi-country trust fund
Funded the study
The World Bank
Funded the study
Total sample size and stratification
The total effective sample size of the IHSES 2007 is 17,822 households. The survey was nominally designed to visit 18,144 households - 324 in each of 56 major strata. The strata are the rural, urban and metropolitan sections of each of Iraq's 18 governorates, with the exception of Baghdad, which has three metropolitan strata. The IHSES 2007 and the MICS 2006 survey intended to visit the same nominal sample. Variable q0040 indicates whether this was indeed the case.
Sampling strategy and sampling stages
The sample was selected in two stages, with groups of majals (Census Enumeration Areas) as Primary Sampling Units (PSUs) and households as Secondary Sampling Units. In the first stage, 54 PSUs were selected with probability proportional to size (pps) within each stratum, using the number of households recorded by the 1997 Census as a measure of size. In the second stage, six households were selected by systematic equal probability sampling (seps) within each PSU. For this purpose, a cartographic updating and household listing operation was conducted in 2006 in all 3,024 PSUs, without resorting to the segmentation of any large PSUs. The total sample is thus nominally composed of 6 households in each of 3,024 PSUs.
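The two selection stages described above can be sketched as follows. This is only an illustration under stated assumptions: the source says "pps" without naming a selection scheme, so systematic PPS is assumed here, and the function names, PSU sizes and listing counts are hypothetical.

```python
import random


def pps_systematic(sizes, k, rng):
    """Pick k PSU indices with probability proportional to size.

    Assumption: systematic PPS on the cumulated size scale. The sizes stand
    for the 1997 Census household counts of the PSUs in one stratum.
    """
    step = sum(sizes) / k            # sampling interval
    start = rng.random() * step      # random start in [0, step)
    targets = [start + i * step for i in range(k)]
    picks, cumulated, t = [], 0.0, 0
    for index, size in enumerate(sizes):
        cumulated += size
        # A PSU is picked once for every target that falls in its interval,
        # so a PSU larger than the interval could be picked more than once.
        while t < k and targets[t] < cumulated:
            picks.append(index)
            t += 1
    return picks


def seps(n_listed, m, rng):
    """Pick m household positions (0-based) by systematic equal probability
    sampling from a listing of n_listed households."""
    step = n_listed / m
    start = rng.random() * step
    return [int(start + j * step) for j in range(m)]


rng = random.Random(2007)
census_sizes = [50 + (i % 7) * 10 for i in range(500)]   # hypothetical stratum
selected_psus = pps_systematic(census_sizes, 54, rng)    # 54 PSUs per stratum
selected_households = seps(150, 6, rng)                  # 6 households per PSU
```

With 54 PSUs in each of the 56 strata and 6 households per PSU, repeating these two steps gives the nominal 3,024 PSUs and 18,144 households.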
Trios, teams and survey waves
The PSUs selected in each governorate (270 in Baghdad and 162 in each of the other governorates) were sorted into groups of three neighboring PSUs called trios -- 90 trios in Baghdad and 54 per governorate elsewhere. The three PSUs in each trio do not necessarily belong to the same stratum.
The 12 months of the data collection period were divided into 18 periods of 20 or 21 days called survey waves. Fieldworkers were organized into teams of three interviewers, each team being responsible for interviewing one trio during a survey wave. The survey used 56 teams in total - 5 in Baghdad and 3 per governorate elsewhere. The 18 trios assigned to each team were allocated into survey waves at random.
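The counts quoted in this field can be cross-checked with a few lines of arithmetic, and the random allocation of trios to waves can be sketched as a shuffle. The sketch assumes a 365-day collection year and places the 21-day waves first; the source states neither, and the function names are hypothetical.

```python
import random

STRATA = 18 * 3 + 2                  # rural/urban/metropolitan per governorate, plus 2 extra in Baghdad
PSUS = STRATA * 54                   # 54 PSUs per stratum
HOUSEHOLDS = PSUS * 6                # 6 households per PSU
TRIOS = 90 + 17 * 54                 # Baghdad plus the 17 other governorates
TEAMS = 5 + 17 * 3                   # Baghdad plus the 17 other governorates


def wave_lengths(days=365, waves=18):
    """Split the collection year into waves of 20 or 21 days.

    Assumption: the longer waves come first; the source does not say which
    waves lasted 21 days.
    """
    base, extra = divmod(days, waves)
    return [base + 1] * extra + [base] * (waves - extra)


def allocate_trios(trio_ids, rng):
    """Assign each of a team's trios to a distinct wave at random."""
    waves = list(range(1, len(trio_ids) + 1))
    rng.shuffle(waves)
    return dict(zip(trio_ids, waves))


rng = random.Random(2006)
allocation = allocate_trios([f"trio-{i:02d}" for i in range(1, 19)], rng)
```

Under these assumptions each team covers 18 trios, one per wave, which matches the 1,008 trios shared among 56 teams.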
The 'time use' module was administered to two of the six households selected in each PSU: nominally the second and fifth households selected by the seps procedure in the PSU.
(For a formatted version of this field, see "IHSES sampling design and sampling weights.pdf" in "External Resources".)
(For a map of Iraq's governorates and districts, see "Iraq governorates and districts.pdf" in "External Resources".)
Deviations from sample design
The design did not consider the replacement of any of the randomly selected units (PSUs or households). However, certain emergency procedures were defined to deal with security situations: if a survey team was unable to visit a trio of PSUs in the originally allocated wave, that trio was to be swapped with the trio from a randomly selected future wave that was secure at the time. If none of the still unvisited trios was secure, one of the secure trios already visited was randomly selected instead, and the team visited in each of its PSUs a new seps sample of six households - different from those interviewed when the trio was visited the first time.
This explains why the survey datasets only contain data from 2,876 of the 3,024 originally selected PSUs, whereas 55 of the PSUs contain more than the six households nominally dictated by the design.
The wave number in the survey datasets is always the nominal wave number, corresponding to the random allocation considered by the design. The effective interview dates can be found in questions 35 to 39 of the survey questionnaires.
Practice deviated from the designed procedures in two cases. In one of the governorates (Suleimaniya), the survey was fielded for an additional two waves (waves 19 and 20) in order to visit an extra 18 PSUs, selected from certain metropolitan areas that were not included in the original sample frame. These areas are to be analyzed jointly with the rest of metropolitan Suleimaniya, but from a sampling standpoint they constitute a de facto fourth stratum in the governorate. In another governorate (Kirkuk), local managers used their judgment rather than the established procedures to select 12 replacement PSUs. To identify the 30 PSUs resulting from these deviations in the survey datasets, their original 'cluster numbers' (ranging from 0001 to 3024) were increased by 5000.
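Because the cluster numbers of the 30 deviating PSUs were increased by 5000, they can be told apart from the regular PSUs with a simple threshold. A minimal sketch, with hypothetical function names and example cluster numbers:

```python
DEVIATION_OFFSET = 5000   # added to the cluster number of the 30 deviating PSUs
MAX_ORIGINAL = 3024       # original cluster numbers range from 0001 to 3024


def is_deviation_psu(cluster_number):
    """True for PSUs from the extra Suleimaniya waves or the Kirkuk replacements."""
    return cluster_number > DEVIATION_OFFSET


def original_cluster_number(cluster_number):
    """Undo the offset, leaving regular cluster numbers untouched."""
    if is_deviation_psu(cluster_number):
        return cluster_number - DEVIATION_OFFSET
    return cluster_number


clusters = [17, 3024, 5123, 7001]   # hypothetical values
flagged = [c for c in clusters if is_deviation_psu(c)]
```

The threshold works because no regular cluster number exceeds 3024, so the two ranges cannot overlap.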
Baghdad has three metropolitan strata by design, whereas an additional metropolitan stratum appeared in Suleimaniya for reasons explained in the field "Deviations from Sample Design".
In Kirkuk the response rate is lower than average in the rural stratum and higher than 100 percent in the metropolitan stratum as a result of the special replacement procedures used there (certain insecure rural PSUs were replaced by metropolitan PSUs -- see field "Deviations from Sample Design").
The selection probability p[hij] of household hij in PSU hi of stratum h is given by
p[hij] = (k[h] n[hi] m[hi]) / (N[h] n'[hi])
where
k[h] is the number of PSUs selected in stratum h;
n[hi] is the number of households in PSU hi, as per the 1997 Census;
N[h] is the total number of households in stratum h (also as per the 1997 Census);
m[hi] is the number of households selected in PSU hi; and
n'[hi] is the number of households in PSU hi, as per the 2006 listing operation.
k[h] is always 54, except in the extra metropolitan stratum in Suleimaniya (18 PSUs) and in the three Kirkuk strata (55 rural PSUs, 55 urban PSUs, and 64 metropolitan PSUs -- see field "Deviations from Sample Design").
The nominal value of m[hi] is 2 for the time use module and 6 for all other modules.
The 'probability weight' w[hij] of household hij is the inverse of its selection probability p[hij].
In the survey datasets, the probability weights so obtained were adjusted by governorate-level coefficients so that the estimated populations match the corresponding projections used by the national food ration system.
(For a formatted version of this field, see "Doc\IHSES sampling design and sampling weights.pdf" in "External Resources".)
(For an Excel spreadsheet that calculates the weights, see "Doc\Weight calculations5.xls" in "External Resources".)
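As a worked illustration of the formula in this field, the sketch below computes a selection probability and the corresponding probability weight. All input values are hypothetical. The last function shows one plausible form of the governorate adjustment (a simple ratio of projected to estimated population); the source does not describe how the coefficients were actually derived.

```python
def selection_probability(k_h, n_hi, m_hi, N_h, n_listed_hi):
    """p[hij] = (k[h] n[hi] m[hi]) / (N[h] n'[hi])."""
    return (k_h * n_hi * m_hi) / (N_h * n_listed_hi)


def probability_weight(k_h, n_hi, m_hi, N_h, n_listed_hi):
    """w[hij] is the inverse of the selection probability."""
    return 1.0 / selection_probability(k_h, n_hi, m_hi, N_h, n_listed_hi)


def governorate_coefficient(projected_population, weights_and_sizes):
    """Assumed ratio adjustment: projection / weighted population estimate.

    weights_and_sizes is a list of (probability weight, household size) pairs.
    """
    estimated = sum(w * size for w, size in weights_and_sizes)
    return projected_population / estimated


# Hypothetical PSU: 120 households in the 1997 Census, 150 in the 2006 listing,
# in a stratum of 64,800 Census households with 54 selected PSUs.
w_main = probability_weight(54, 120, 6, 64800, 150)      # all modules except time use
w_time_use = probability_weight(54, 120, 2, 64800, 150)  # time use module
```

In this example the time use weight is three times the main weight, because only 2 of the 6 selected households received that module.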
Dates of collection
Initially planned data collection period
Extension of data collection in Kurdistan region
Mode of data collection
Data collection supervision
The interviewers were supervised by 56 local supervisors along with regular supervision visits by central supervisors.
The questionnaire was designed by COSIT in continuous consultation with the WB consultants. It is composed of 18 sections covering household characteristics, government ration, housing, education, health, recreation facilities, employment, expenditure and income, transfers and risks along with the diary and time use. A pre-test of the questionnaire was conducted at an early stage of the project in a small number of households with different characteristics in some governorates.
To facilitate its administration, the questionnaire was divided into 5 physical booklets called "forms". Form 1 gathers socio-economic information on household members and housing; Form 2 records non-food expenditures; Form 3 covers employment, transfers and others; Form 4 is the diary used to record the household's food purchases over 10 days; and Form 5 is the time use sheet, administered to one third of the households in the sample.
All forms were produced in three languages: Arabic, Kurdish and English (all available in "External Resources").
Data editing took place at a number of stages throughout the processing, including:
1. Office editing by local supervisors.
2. Production of rejection reports from the validation rules incorporated in the data entry program (CSPro), followed by correction of the data flagged in those reports.
3. Structural checking of SPSS data files.
4. Automatic fixing programme at the analysis phase.
Detailed documentation of the editing of data can be found in the "Data processing guidelines" document provided as an external resource.
The data collected in the field was entered, wave after wave, separately in each governorate. All the rejections issued by the entry programs were dealt with within each team. At the end of each of the 18 waves, the data was sent to (or centrally picked up by) the Data Management Team (DMT), which re-checked the information and sent any incomplete or unacceptable data back for fixing.
Then, the final consolidated data for a wave was exported to SPSS into a set of files delivered to the Data Analysis Unit (DAU) in a pack known as "generation 1" of the wave. The DAU identified specific issues in the data and either requested further fixes from the DMT or cleaned up the outliers and unacceptable cases itself. This activity produced a "generation 2" of the SPSS databases, which was used as input for adding variables such as expenditure and income aggregates and new classifications of households and persons, including unemployment descriptors, to produce a "generation 3". The latter was used to create a final "generation 4" of the databases, adding consumption aggregates, the classification of households by poverty status and other poverty-related variables.
To deal with all the data management responsibilities, the DMT produced or acquired a number of software tools for better supporting the project.
The core piece of software, a data entry program (developed in CSPro 3.01), allowed entry operators to enter and validate the information collected in the field, with strong consistency checks for improving the quality of the data. The main controls included: (1) ranges for numeric variables, (2) demographic consistency within the household, including full control of education, health and labor data, (3) checks of unit values and measurement units for acquired items, (4) extensive use of control subtotals for critical sections, (5) checks of the household metadata against the sample, and (6) a balance of calories per capita based on food transactions. The screens and error messages were displayed in one of three languages (Arabic, Kurdish or English), depending on the choice of the data entry operator.
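To make two of these controls concrete, the sketch below shows what a range check and a control-subtotal check might look like. It is written in Python purely for illustration; the actual program was developed in CSPro, and the field names, limits and tolerance used here are hypothetical.

```python
def check_range(record, field, low, high):
    """Control (1): a numeric variable must fall inside an allowed range."""
    value = record[field]
    if not (low <= value <= high):
        return f"{field}={value} outside [{low}, {high}]"
    return None


def check_subtotal(items, declared_total, tolerance=0.5):
    """Control (4): the declared subtotal must match the sum of its items."""
    computed = sum(items)
    if abs(computed - declared_total) > tolerance:
        return f"subtotal {declared_total} does not match item sum {computed}"
    return None


def rejection_report(record):
    """Collect the messages of all failed checks for one record."""
    checks = [
        check_range(record, "age", 0, 120),                           # hypothetical limits
        check_subtotal(record["food_items"], record["food_total"]),   # hypothetical fields
    ]
    return [message for message in checks if message is not None]


clean = {"age": 34, "food_items": [1200, 800, 500], "food_total": 2500}
faulty = {"age": 340, "food_items": [1200, 800, 500], "food_total": 3500}
```

An empty report means the record passes both checks; each failed check contributes one line to the report.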
Time use sheets collected for one third of the surveyed households were converted into text files using scanners in each governorate. Despite the difficulties posed by the variety of formats and scan devices available, scanning was the only choice for recording the activities declared by the interviewees at a resolution of one quarter hour across the 24 hours of the day.
An export module, also in CSPro, was included for transferring data into SPSS and Stata. During the export process, the same consistency checks of the data entry program were run again, plus other controls that checked the completion of the work in each governorate after each wave. The scripted export module reduced the data to just 12 interlinkable files.
Friendly menus written in Visual Basic allowed for a simplified utilization of the different components of the entry tool.
Starting with the 7th wave, the data files of some governorates could be accessed and retrieved from a central location using remote internet access via LogMeIn. The remaining governorates kept sending their files by email, since there were technical problems that the data management team could not solve because of security constraints.
Processing ends when the data has been verified by both the DMT and the DAU.
The estimation of standard errors must account for the design features explained in the "Sampling" field. (See also "IHSES sampling design and sample weights" in "External Resources.")
The following variables, included in all datasets, are needed for the estimation of standard errors:
xweight: sampling weight
xstrat: sampling stratum
xcluster: primary sampling unit
Warning: Variable 'xbeea', also present in all datasets, identifies rural, urban and metropolitan environments for tabulation purposes; it is sometimes wrongly referred to as 'stratum', but it should not be used for the estimation of sampling errors. The variable that needs to be used for these purposes is 'xstrat', which identifies the 57 sampling strata, defined as the rural, urban and metropolitan sectors of each of the 18 governorates, with the exception of Baghdad (which has three metropolitan sectors) and Suleimaniya (which has two).
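A minimal sketch of how the three design variables enter a standard error calculation is given below. It assumes the common linearization ("ultimate cluster") estimator for a weighted mean, treating PSUs as sampled with replacement within strata; the source does not prescribe a particular estimator, and the analysis variable 'pcexp' and the example rows are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def weighted_mean_se(rows, value):
    """Weighted mean of rows[value] and its design-based standard error.

    Uses xweight as the weight, xstrat as the stratum and xcluster as the PSU.
    Assumption: linearization estimator with between-PSU variance within strata.
    """
    total_weight = sum(r["xweight"] for r in rows)
    mean = sum(r["xweight"] * r[value] for r in rows) / total_weight

    # Sum the linearized values of the households within each PSU.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        z = r["xweight"] * (r[value] - mean) / total_weight
        psu_totals[r["xstrat"]][r["xcluster"]] += z

    variance = 0.0
    for stratum, clusters in psu_totals.items():
        n_h = len(clusters)
        if n_h < 2:
            raise ValueError(f"stratum {stratum} has a single PSU")
        z_bar = sum(clusters.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - z_bar) ** 2 for z in clusters.values())
    return mean, sqrt(variance)


rows = [  # hypothetical households
    {"xweight": 1.0, "xstrat": 1, "xcluster": 11, "pcexp": 0.0},
    {"xweight": 1.0, "xstrat": 1, "xcluster": 12, "pcexp": 2.0},
    {"xweight": 1.0, "xstrat": 2, "xcluster": 21, "pcexp": 4.0},
    {"xweight": 1.0, "xstrat": 2, "xcluster": 22, "pcexp": 4.0},
]
mean, se = weighted_mean_se(rows, "pcexp")
```

Grouping by 'xbeea' instead of 'xstrat' would pool PSUs from different sampling strata and give a different variance, which is why the warning above matters.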
In receiving these data it is recognized that the data are supplied for use within your organization, and you agree to the following stipulations as conditions for the use of the data:
1. The data are supplied solely for the use described in this form and will not be made available to other organizations or individuals. Other organizations or individuals may request the data directly.
2. Three copies of all publications, conference papers, or other research reports based entirely or in part upon the requested data will be supplied to:
Central Organization for Statistics and Information Technology (COSIT)
The World Bank, Development Economics Research Group
LSMS Database Administrator
1818 H Street, NW
Washington, DC 20433, USA
tel: (202) 473-9041
fax: (202) 522-1153
3. The researcher will refer to the 2006-07 Iraq Household Socio-Economic Survey as the source of the information in all publications, conference papers, and manuscripts. At the same time, the World Bank is not responsible for the estimations reported by the analyst(s).
4. Users who download the data may not pass the data to third parties.
5. The database cannot be used for commercial ends, nor can it be sold.
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
LSMS Data Manager
The World Bank
Development Data Group
The World Bank
Documentation of the DDI
Version 02 (July 23, 2015). The DDI has been updated using revised survey data.