Living Standards Survey V 2005-2006 - World Bank SHIP Harmonized Dataset
Survey based Harmonized Indicators (SHIP) files are harmonized data files from household surveys that are conducted by countries in Africa. To ensure the quality and transparency of the data, it is critical to document the procedures of compiling consumption aggregation and other indicators so that the results can be duplicated with ease. This process enables consistency and continuity that make temporal and cross-country comparisons consistent and more reliable.
Four harmonized data files are prepared for each survey to generate a set of harmonized variables that have the same variable names. Invariably, in each survey, questions are asked in a slightly different way, which poses challenges on consistent definition of harmonized variables. The harmonized household survey data present the best available variables with harmonized definitions, but not identical variables. The four harmonized data files are
a) Individual level file (Labor force indicators in a separate file): This file has information on basic characteristics of individuals such as age and sex, literacy, education, health, anthropometry and child survival.
b) Labor force file: This file has information on labor force including employment/unemployment, earnings, sectors of employment, etc.
c) Household level file: This file has information on household expenditure, household head characteristics (age and sex, level of education, employment), housing amenities, assets, and access to infrastructure and services.
d) Household Expenditure file: This file has consumption/expenditure aggregates by consumption groups according to Purpose (COICOP) of Household Consumption of the UN.
Kind of data
Sample survey data [ssd]
The SHIP datasets contains harmonized variables, produced by Africa Region, Office of the Chief Economist (AFRCE) based on the raw data from houshold surveys conducted in African countries.
The data provided by the National Statistical Office is harmonized across countries and across time using a standard SHIP methodology.
Unit of analysis
- Individual level for datasets with suffix _I and _L
- Household level for datasets with suffix _H and _E
The survey covered all de jure household members (usual residents).
Producers and sponsors
Ghana Statistical Service (GSS)
Office of the Chief Economist - Africa Region
Sampling Frame and Units
As in all probability sample surveys, it is important that each sampling unit in the surveyed population has a known, non-zero probability of selection. To achieve this, there has to be an appropriate list, or sampling frame of the primary sampling units (PSUs).The universe defined for the GLSS 5 is the population living within private households in Ghana. The institutional population (such as schools, hospitals etc), which represents a very small percentage in the 2000 Population and Housing Census (PHC), is excluded from the frame for the GLSS 5.
The Ghana Statistical Service (GSS) maintains a complete list of census EAs, together with their respective population and number of households as well as maps, with well defined boundaries, of the EAs. . This information was used as the sampling frame for the GLSS 5. Specifically, the EAs were defined as the primary sampling units (PSUs), while the households within each EA constituted the secondary sampling units (SSUs).
In order to take advantage of possible gains in precision and reliability of the survey estimates from stratification, the EAs were first stratified into the ten administrative regions. Within each region, the EAs were further sub-divided according to their rural and urban areas of location. The EAs were also classified according to ecological zones and inclusion of Accra (GAMA) so that the survey results could be presented according to the three ecological zones, namely 1) Coastal, 2) Forest, and 3) Northern Savannah, and for Accra.
Sample size and allocation
The number and allocation of sample EAs for the GLSS 5 depend on the type of estimates to be obtained from the survey and the corresponding precision required. It was decided to select a total sample of around 8000 households nationwide.
To ensure adequate numbers of complete interviews that will allow for reliable estimates at the various domains of interest, the GLSS 5 sample was designed to ensure that at least 400 households were selected from each region.
A two-stage stratified random sampling design was adopted. Initially, a total sample of 550 EAs was considered at the first stage of sampling, followed by a fixed take of 15 households per EA. The distribution of the selected EAs into the ten regions or strata was based on proportionate allocation using the population.
For example, the number of selected EAs allocated to the Western Region was obtained as: 1924577/18912079*550 = 56
Under this sampling scheme, it was observed that the 400 households minimum requirement per region could be achieved in all the regions but not the Upper West Region. The proportionate allocation formula assigned only 17 EAs out of the 550 EAs nationwide and selecting 15 households per EA would have yielded only 255 households for the region. In order to surmount this problem, two options were considered: retaining the 17 EAs in the Upper West Region and increasing the number of selected households per EA from 15 to about 25, or increasing the number of selected EAs in the region from 17 to 27 and retaining the second stage sample of 15 households per EA.
The second option was adopted in view of the fact that it was more likely to provide smaller sampling errors for the separate domains of analysis. Based on this, the number of EAs in Upper East and the Upper West were adjusted from 27 and 17 to 40 and 34 respectively, bringing the total number of EAs to 580 and the number of households to 8,700.
A complete household listing exercise was carried out between May and June 2005 in all the selected EAs to provide the sampling frame for the second stage selection of households. At the second stage of sampling, a fixed number of 15 households per EA was selected in all the regions. In addition, five households per EA were selected as replacement samples.The overall sample size therefore came to 8,700 households nationwide.
Data collected by the survey was harmonized in accordance with the guidelines in the Africa SHIP manual. Program with suffix _I extracts individual level information, _L extracts labor and time use information, _H extracts household information and _E extracts expenditure information.
Office of the Chief Economist, Africa Region
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the Identification of the SHIP Harmonized Dataset producer
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the producer of the SHIP Harmonized Dataset, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.