The GHS is an annual household survey, specifically designed to measure various aspects of the living circumstances of South African households. The key findings reported here focus on the five broad areas covered by the GHS, namely: education, health, activities related to work and unemployment, housing, and household access to services and facilities.
Kind of data
Sample survey data [ssd]
v1.2 Edited, anonymised dataset for public distribution
Version 1 of the GHS 2009 dataset was released with weights which take into account (a) the findings of the Community Survey 2007 and new HIV/AIDS and mortality data, and (b) the adjusted provincial boundaries that came into effect in December 2006.
Version 1.1 of the dataset was downloaded from Statistics South Africa's website in August 2011.
Version 1.1 differs from version 1 of the dataset in the following ways: there are mismatches between variables “q419rem” and “totmhinc” in the Household data. The derived variable for disability categories (“undisab”) in the household file of the earlier version (1) does not exist in this later version (1.1).
A derived variable for salary categories (“Q142msalcat”) has been included in the Person file for version 1.1.
in-job training [3.2]
labour relations/conflict [3.3]
working conditions [3.6]
LABOUR AND EMPLOYMENT 
TRADE, INDUSTRY AND MARKETS 
DEMOGRAPHY AND POPULATION 
The General Household Survey 2009 has national coverage.
The lowest level of geographic aggregation covered by the General Household Survey 2009 is the province.
Unit of analysis
The units of analysis for the General Household Survey 2009 are individuals and households.
The survey covered all de jure household members (usual residents) of households in the nine provinces of South Africa and residents in workers' hostels. The survey does not cover collective living quarters such as students' hostels, old age homes, hospitals, prisons and military barracks.
Producers and sponsors
Statistics South Africa
The sample design for the GHS 2009 was based on a master sample (MS) that was originally designed for the QLFS and was used for the first time for the GHS in 2008. This master sample is shared by the Quarterly Labour Force Surveys (QLFS), General Household Survey (GHS), Living Conditions Survey (LCS), Domestic Tourism Survey and the Income and Expenditure Surveys (IES).
The master sample used a two-stage, stratified design with probability-proportional-to-size (PPS) sampling of primary sampling units (PSUs) from within strata, and systematic sampling of dwelling units (DUs) from the sampled PSUs. A self-weighting design at provincial level was used and MS stratification was divided into two levels. Primary stratification was defined by metropolitan and non-metropolitan geographic area type. During secondary stratification, the Census 2001 data were summarised at PSU level. The following variables were used for secondary stratification: household size, education, occupancy status, gender, industry and income.
Census enumeration areas (EAs) as delineated for Census 2001 formed the basis of the PSUs. The following additional rules were used:
• Where possible, PSU sizes were kept between 100 and 500 dwelling units (DUs);
• EAs with fewer than 25 DUs were excluded;
• EAs with between 26 and 99 DUs were pooled to form larger PSUs, and the criterion used was same settlement type;
• Virtual splits were applied to large PSUs: 500 to 999 split into two; 1 000 to 1 499 split into three; and 1 500 plus split into four PSUs; and
• Informal PSUs were segmented.
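The size rules above can be sketched as a small lookup. This is only an illustration, not Stats SA's procedure: the function name `psu_treatment` is hypothetical, and two boundary cases that the rules leave open are resolved by assumption (an EA with exactly 25 DUs is treated as pooled, and a PSU with exactly 500 DUs is treated as split into two).

```python
def psu_treatment(dwelling_units: int) -> str:
    """Return the treatment implied by the size rules for an EA/PSU.

    Assumptions (not stated in the source):
      - exactly 25 DUs is treated as "pooled";
      - exactly 500 DUs is treated as "split into 2".
    """
    if dwelling_units < 25:
        return "excluded"        # EAs with fewer than 25 DUs
    if dwelling_units < 100:
        return "pooled"          # 26 to 99 DUs, pooled by settlement type
    if dwelling_units < 500:
        return "kept"            # within the target PSU size range
    if dwelling_units < 1000:
        return "split into 2"    # 500 to 999 DUs
    if dwelling_units < 1500:
        return "split into 3"    # 1 000 to 1 499 DUs
    return "split into 4"        # 1 500 DUs and above


for size in (10, 60, 300, 750, 1200, 2000):
    print(size, psu_treatment(size))
```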
A Randomised Probability Proportional to Size (RPPS) systematic sample of PSUs was drawn in each stratum, with the measure of size being the number of households in the PSU. Altogether approximately 3 080 PSUs were selected. In each selected PSU a systematic sample of dwelling units was drawn. The number of DUs selected per PSU varies from PSU to PSU and depends on the Inverse Sampling Ratios (ISR) of each PSU.
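As a rough illustration of the PSU selection step, the sketch below draws a randomised PPS systematic sample within one stratum. It is a generic textbook version, not the Stats SA implementation: the function name `rpps_systematic_sample`, the household counts and the seed are all hypothetical, and it assumes no PSU is larger than the sampling interval.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def rpps_systematic_sample(sizes, n, seed=None):
    """Select n units with probability proportional to size.

    sizes: measure of size per PSU (here, number of households).
    Returns the indices of the selected PSUs.
    Assumes every size is smaller than total / n (no certainty units).
    """
    rng = random.Random(seed)
    order = list(range(len(sizes)))
    rng.shuffle(order)                        # randomise the list order
    cumulative = list(accumulate(sizes[i] for i in order))
    interval = cumulative[-1] / n             # sampling interval
    start = rng.uniform(0, interval)          # random start in first interval
    selected = []
    for k in range(n):
        point = start + k * interval
        # first unit whose cumulative size exceeds the selection point
        pos = min(bisect_right(cumulative, point), len(cumulative) - 1)
        selected.append(order[pos])
    return selected


# Hypothetical household counts for eight PSUs in one stratum.
households = [120, 300, 250, 480, 150, 200, 410, 90]
chosen = rpps_systematic_sample(households, n=3, seed=2009)
print(chosen)
```

Larger PSUs occupy a longer stretch of the cumulative scale, so a fixed-interval pass over that scale selects them with proportionally higher probability.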
Dates of collection
Mode of data collection
The GHS 2009 questionnaire collected data on:
Household characteristics: dwelling type, home ownership, access to water and sanitation facilities, access to services, transport, household assets, land ownership, agricultural production.
Individuals' characteristics: demographic characteristics, relationship to household head, marital status, language, education, employment, income, health, disability, access to social services, mortality.
Women's characteristics: fertility.
Estimation and use of standard error
The published results of the General Household Survey are based on representative probability samples drawn from the South African population, as discussed in the section on sample design. Consequently, all estimates are subject to sampling variability. This means that the sample estimates may differ from the population figures that would have been produced if the entire South African population had been included in the survey. The measure usually used to indicate the probable difference between a sample estimate and the corresponding population figure is the standard error (SE), which measures the extent to which an estimate might have varied by chance because only a sample of the population was included.
There are two major factors which influence the value of a standard error. The first factor is the sample size. Generally speaking, the larger the sample size, the more precise the estimate and the smaller the standard error. Consequently, in a national household survey such as the GHS, one expects more precise estimates at the national level than at the provincial level due to the larger sample size involved. The second factor is the variability between households of the parameter of the population being estimated, for example, the number of unemployed persons in the household.
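The two factors can be illustrated with the simplest case, the standard error of a mean under simple random sampling, s / sqrt(n). This is a simplification for illustration only: it ignores the weights, stratification and clustering of the GHS design, and the household values below are made up, not GHS data.

```python
import math
import statistics


def standard_error_of_mean(values):
    """SE of a sample mean under simple random sampling: s / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical numbers of unemployed persons per household (not GHS data).
provincial_sample = [0, 1, 2, 0, 1, 3, 0, 2, 1, 0]
# Same pattern of values, nine times as many households.
national_sample = provincial_sample * 9
# Same size as the provincial sample, but households differ less.
low_variability_sample = [1, 1, 1, 2, 1, 1, 1, 2, 1, 1]

se_provincial = standard_error_of_mean(provincial_sample)
se_national = standard_error_of_mean(national_sample)
se_low_variability = standard_error_of_mean(low_variability_sample)

print(f"provincial:      {se_provincial:.3f}")
print(f"national:        {se_national:.3f}")
print(f"low variability: {se_low_variability:.3f}")
```

The larger sample and the less variable sample both give a smaller standard error than the provincial sample. Standard errors for actual GHS estimates would need a design-based method that accounts for the complex sample.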
University of Cape Town
The GHS 2009 dataset is a licensed dataset, accessible under conditions.
Publications based on datasets distributed by DataFirst should acknowledge relevant sources by means of bibliographic citations. To ensure that such source attributions are captured for social science bibliographic utilities, citations must appear in footnotes or in the reference section of publications. The bibliographic citation for this dataset is:
General Household Survey 2009 [microdata files]. Pretoria: Statistics South Africa [producer], 2010. Cape Town: DataFirst [distributor], 2011.
Disclaimer and copyrights
The information products and services of Statistics South Africa are protected in terms of the Copyright Act, 1978 (Act 98 of 1978). As the State President is the holder of State copyright, all organs of State enjoy unhindered use of the Department's information products and services, without a need for further permission to copy in terms of that copyright. Where a copy of the information is made available to any third party outside the State, the third party must be made aware of the existence of State copyright and ownership of the information by the State. The State (through Statistics SA) retains the full ownership of its information, products and services at all times; access to information does not give ownership of the information to the client.
The use of any data is subject to acknowledgement of Stats SA as the supplier and owner of copyright. Statistics South Africa (Stats SA) will not be liable for any damages or losses, except to the extent that such losses or damages are attributable to a breach by Stats SA of its obligations in terms of an existing agreement or to the negligence or wilful act or omissions of Stats SA, its servants or agents, arising out of the supply of data and/or digital products in terms of that agreement. The user indemnifies Stats SA against any claims of whatsoever nature (including legal costs) by third parties arising from the reformatting, restructuring, reprocessing and/or addition of the data, by the user.
Copyright 2010, Statistics South Africa
University of Cape Town
Version 1.2 - Adapted for use by the World Bank Microdata Library - changed study ID to match Microdata Library Standard