ZAF_2007_PHC_v01_M
Community Survey 2007
Name | Country code |
---|---|
South Africa | ZAF |
Sample survey data [ssd]
Households
v1.2: Edited, anonymised dataset for licensed distribution.
2011
Version 1 of the Community Survey 2007 dataset did not include fertility and mortality data (from sections F and I of the questionnaire respectively).
Version 1.1, downloaded from Statistics South Africa's website on 17 October 2011, includes fertility and mortality data. The metadata file provided with that version is the metadata supplied with version 1 and therefore does not cover the fertility and mortality data.
Geography variables were provided in a separate data file in version 1 of the dataset, but were included in the "Person", "Household" and "Mortality" files of version 1.1 of the dataset.
This version, version 1.2, includes the changes made in version 1.1. However, the variables in version 1.1 were strings; in version 1.2 these have been converted to numeric variables for ease of use.
The scope of the Community Survey (CS) includes:
Demographic characteristics (age and sex, population group, fertility, mortality), migration, economic activity, geographical distribution, marital status, disability, education, household goods, access to services (social security services, housing, water, energy, sanitation, communication services, refuse removal).
Topic | Vocabulary | URI |
---|---|---|
fertility [14.2] | CESSDA | http://www.nesstar.org/rdf/common |
migration [14.3] | CESSDA | http://www.nesstar.org/rdf/common |
morbidity and mortality [14.4] | CESSDA | http://www.nesstar.org/rdf/common |
children [12.1] | CESSDA | http://www.nesstar.org/rdf/common |
social conditions and indicators [13.8] | CESSDA | http://www.nesstar.org/rdf/common |
economic conditions and indicators [1.2] | CESSDA | http://www.nesstar.org/rdf/common |
employment [3.1] | CESSDA | http://www.nesstar.org/rdf/common |
unemployment [3.5] | CESSDA | http://www.nesstar.org/rdf/common |
EDUCATION [6] | CESSDA | http://www.nesstar.org/rdf/common |
health care and medical treatment [8.5] | CESSDA | http://www.nesstar.org/rdf/common |
housing [10.1] | CESSDA | http://www.nesstar.org/rdf/common |
specific social services: use and provision [15.3] | CESSDA | http://www.nesstar.org/rdf/common |
The survey covered the whole of South Africa, including all nine provinces as well as the four settlement types - urban-formal, urban-informal, rural-formal (commercial farms) and rural-informal (tribal areas).
The Community Survey covered all de jure household members (usual residents) in South Africa. The survey excluded collective living quarters (institutions) and some households in EAs classified as recreational areas or institutions. However, an approximation of the out-of-scope population was made from the 2001 Census and added to the final estimates of the CS 2007 results.
Name |
---|
Statistics South Africa |
Name |
---|
The Government of South Africa |
Sample Design
The sampling procedure that was adopted for the CS was a two-stage stratified random sampling process. Stage one involved the selection of enumeration areas, and stage two was the selection of dwelling units.
Since the data are required for each local municipality, each municipality was considered as an explicit stratum. The stratification was done for those municipalities classified as category B municipalities (local municipalities) and category A municipalities (metropolitan areas) as proclaimed at the time of Census 2001. However, the newly proclaimed boundaries, as well as any other higher level of geography such as province or district municipality, were considered as any other domain variable based on their link to the smallest geographic unit, the enumeration area.
The Frame
The Census 2001 enumeration areas were used because they give a full geographic coverage of the country without any overlap. Although changes in settlement type, growth or movement of people have occurred, the enumeration areas assisted in getting a spatial comparison over time. Out of 80 787 enumeration areas countrywide, 79 466 were considered in the frame. A total of 1 321 enumeration areas were excluded (919 covering institutions and 402 recreational areas).
On the second level, the listing exercise yielded the dwelling frame which facilitated the selection of dwellings to be visited. The dwelling unit is a structure or part of a structure or group of structures occupied or meant to be occupied by one or more households. Some of these structures may be vacant and/or under construction, but can be lived in at the time of the survey. A dwelling unit may also be within collective living quarters where applicable (examples of each are a house, a group of huts, a flat, hostels, etc.).
The Community Survey universe at the second-level frame is dependent on whether the different structures are classified as dwelling units (DUs) or not. Structures where people stay/live were listed and classified as dwelling units. However, there are special cases of collective living quarters that were also included in the CS frame. These are religious institutions such as convents or monasteries, and guesthouses where people stay for an extended period (more than a month). Student residences - based on how long people have stayed (more than a month) - and old-age homes not similar to hospitals (where people are living in a communal set-up) were treated the same as hostels, thereby listing either the bed or room. In addition, any other family staying in separate quarters within the premises of an institution (like wardens' quarters, military family quarters, teachers' quarters and medical staff quarters) were considered as part of the CS frame. The inclusion of such group quarters in the frame is based on the living circumstances within these structures. Members are independent of each other with the exception that they sleep under one roof.
The remaining group quarters were excluded from the CS frame because they are difficult to access and have no stable composition. Excluded dwelling types were prisons, hotels, hospitals, military barracks, etc. This is in addition to the exclusion on first level of the enumeration areas (EAs) classified as institutions (military bases) or recreational areas (national parks).
The Selection of Enumeration Areas (EAs)
The EAs within each municipality were ordered by geographic type and EA type. The selection was done by using systematic random sampling. The criteria used were as follows:
In municipalities with fewer than 30 EAs, all EAs were automatically selected.
In municipalities with 30 or more EAs, the sample selection used a fixed proportion of 19% of all sampled EAs. However, if fewer than 30 EAs were selected in a municipality, the sample in that municipality was increased to 30 EAs.
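The two criteria above can be sketched as a small sample-size rule. This is an illustration only, not Stats SA's implementation: the function name is hypothetical, and rounding the 19% share up to a whole EA is an assumption, since the text does not state a rounding rule.

```python
EA_SAMPLING_PERCENT = 19        # fixed proportion stated in the text
MIN_EAS_PER_MUNICIPALITY = 30   # floor stated in the text


def ea_sample_size(total_eas: int) -> int:
    """Number of EAs to select in one municipality (hypothetical helper)."""
    if total_eas < 0:
        raise ValueError("total_eas must be non-negative")
    # Municipalities with fewer than 30 EAs: take every EA.
    if total_eas < MIN_EAS_PER_MUNICIPALITY:
        return total_eas
    # Assumption: round the 19% share up, using integer arithmetic
    # to avoid floating-point surprises.
    proportional = (EA_SAMPLING_PERCENT * total_eas + 99) // 100
    # If the proportional sample falls below 30 EAs, raise it to 30.
    return max(MIN_EAS_PER_MUNICIPALITY, proportional)


for n in (12, 100, 200, 1000):
    print(n, ea_sample_size(n))
```

Under these assumptions, any municipality with roughly 30 to 157 EAs ends up with exactly 30 sampled EAs, so the effective sampling fraction is well above 19% in smaller municipalities.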
The Selection of Dwelling Units
The second level of the frame required a full re-listing of dwelling units. The listing exercise was undertaken before the selection of DUs. The adopted listing methodology ensured that the listing route was determined by the lister. This approach facilitated the serpentine selection of dwelling units. The listing exercise provided a complete list of dwelling units in the selected EAs. Only those structures that were classified as dwelling units were considered for selection, whether vacant or occupied. This exercise yielded a total of 2 511 314 dwelling units.
The selection of the dwelling units was also based on a fixed proportion of 10% of the total listed dwellings in an EA. A constraint was imposed on small-size EAs where, if the listed dwelling units were fewer than 10 dwellings, the selection was increased to 10 dwelling units. All households within the selected dwelling units were covered. There was no replacement of refusals, vacant dwellings or non-contacts owing to their impact on the probability of selection.
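A minimal sketch of the second-stage selection, assuming a systematic sample with a random start taken along the listing route. The function names are hypothetical. Rounding the 10% share up and capping the sample at the number of listed dwellings are assumptions: the text does not say how EAs with fewer than 10 listed dwellings were handled in practice.

```python
import random


def dwelling_sample_size(listed: int) -> int:
    """10% of listed dwellings, raised to 10 where possible (assumed rule)."""
    target = max(10, (listed + 9) // 10)  # assumption: round 10% up
    return min(listed, target)            # cannot select more than were listed


def systematic_sample(n_listed: int, n_select: int, rng: random.Random) -> list[int]:
    """Return 0-based positions along the listing route, one per interval."""
    if n_select <= 0:
        return []
    interval = n_listed / n_select        # sampling interval (>= 1)
    start = rng.random() * interval       # random start within the first interval
    return [min(n_listed - 1, int(start + k * interval)) for k in range(n_select)]


rng = random.Random(2007)                 # fixed seed so the example is repeatable
listed = 250
picks = systematic_sample(listed, dwelling_sample_size(listed), rng)
print(len(picks), picks[:5])
```

Because every position on the route has the same chance of selection, replacing a refusal or vacant dwelling with a neighbour would alter that chance, which is consistent with the no-replacement rule stated above.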
Community Survey 2007 Response Rates
Total number of dwelling units - 274 348 dwelling units
Completed cases responding - 238 067 dwelling units (93,9%)
Non-response cases - 15 393 dwelling units (6,1%)
Invalid or out-of-scope cases - 20 888 dwelling units
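The percentages above appear to be computed on eligible dwelling units only, that is, after removing invalid or out-of-scope cases from the total. The short check below reproduces them from the four published counts; the function and key names are illustrative.

```python
def response_summary(completed: int, non_response: int, out_of_scope: int) -> dict:
    """Response and non-response rates among eligible dwelling units."""
    eligible = completed + non_response   # out-of-scope cases are excluded
    return {
        "total": eligible + out_of_scope,
        "eligible": eligible,
        "response_rate": completed / eligible,
        "non_response_rate": non_response / eligible,
    }


summary = response_summary(completed=238_067, non_response=15_393, out_of_scope=20_888)
print(summary["total"])                                # 274348
print(f"{summary['response_rate']:.1%}")               # 93.9%
print(f"{summary['non_response_rate']:.1%}")           # 6.1%
```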
The Weights Calculation
The Community Survey sample has equal probabilities for all elements in the cluster, which makes it a self-weighting systematic random sample. Since the sample is stratified by municipalities as demarcated at the time of Census 2001, the inclusion probability of selection of an EA at the first level of selection, and the dwelling unit at the second level of selection, is the product of first- and second-level probabilities. Also, since all households within the dwelling unit are considered, their probability of being in the dwelling unit is always one.
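The product of probabilities described above can be illustrated as follows. This is a sketch of a base design weight only, assuming equal selection probabilities within each stage; the function name and the example counts are hypothetical, and the released weights may include adjustments not described in this section.

```python
def design_weight(eas_selected: int, eas_in_stratum: int,
                  dus_selected: int, dus_listed: int) -> float:
    """Inverse of the overall inclusion probability of a household."""
    p_ea = eas_selected / eas_in_stratum   # first stage: EA within municipality
    p_du = dus_selected / dus_listed       # second stage: dwelling unit within EA
    p_household = 1.0                      # all households in a selected DU are taken
    return 1.0 / (p_ea * p_du * p_household)


# Hypothetical example: 38 of 200 EAs selected, then 25 of 250 listed dwellings.
w = design_weight(38, 200, 25, 250)
print(round(w, 2))
```

With a 19% first-stage and 10% second-stage fraction, each sampled household stands for roughly 53 households in the frame.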
Consultation on Questionnaire Design
Ten stakeholder workshops were held across the country during August and September 2004. Approximately 367 stakeholders, predominantly from national, provincial and local government departments, as well as from research and educational institutions, attended. The workshops aimed to achieve two objectives, namely to better understand the type of information stakeholders need to meet their objectives, and to consider the proposed data items to be included in future household surveys. The output from this process was a set of data items relating to a specific, defined focus area and outcomes that culminated in the data collection instrument (see Annexure B for all the data items).
Questionnaire Design
The design of the CS questionnaire was household-based and intended to collect information on 10 people. It was developed in line with the household-based survey questionnaires conducted by Stats SA. The questions were based on the data items generated out of the consultation process described above. Both the design and questionnaire layout were pre-tested in October 2005 and adjustments were made for the pilot in February 2006. Further adjustments were made after the pilot results had been finalised.
Start | End |
---|---|
2007-02 | 2007-03 |
Name |
---|
Statistics South Africa |
Supervision
The data collection approach revolved around the use of a mobile team of four enumerators and a supervisor. The team was assigned a fixed number of EAs to enumerate. The team worked together in each sampled EA and moved to the next one once the targeted EA had been completed. The advantage of this method was that the supervisor was in daily contact with the team, which improved the quality of the data collected during fieldwork. During enumeration, supervisors (who doubled up as drivers) and their teams of four enumerators each identified the selected EA. They then dropped off each enumerator at a selected dwelling unit, and ascertained that each enumerator had been accepted to conduct the interview. They picked up the enumerator who had completed the interview and immediately checked the questionnaire for errors, consistency and completeness. Where errors were found, the enumerator was sent back to the household to correct the information that had been recorded. If the supervisors were satisfied, they signed off the questionnaire and stored it in a safe place. Supervisors did the same for all the members of their teams until the EA had been completed. The team then moved to another selected EA.
Based on the number of sampled EAs (17 098), 1 182 teams comprising 1 182 Fieldwork Supervisors and 4 728 enumerators were formed. Each team was expected to enumerate about 16-20 EAs in three weeks, with an additional one week assigned for non-contacts and refusals. The supervisors were supervised by 236 Fieldwork Coordinators (FWCs), resulting in a Fieldwork Coordinator-to-Supervisor ratio of 1:5. Fieldwork Coordinators were supervised by 55 District Survey Coordinators (DSCs), resulting in a DSC-to-FWC ratio of 1:4. The DSCs were supervised by nine Provincial Survey Coordinators (PSCs), resulting in a PSC-to-DSC ratio of approximately 1:6. The PSCs were based in their respective provincial offices. They coordinated data collection for their assigned province. The 55 DSCs were based in 55 district offices (DOs) that were temporarily created specifically for CS 2007. Each of the 236 FWCs had a temporary local office or fieldwork station that was used as a base during the fieldwork phase of the project, and also for the training of enumerators and supervisors attached to them.
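The supervision ratios quoted above follow directly from the staffing counts. The sketch below recomputes them; the function name and the short labels are illustrative, and the counts are the ones given in the text.

```python
def span_of_control(levels: list[tuple[str, int]]) -> dict[str, float]:
    """Subordinates per supervisor for each adjacent pair of levels (top first)."""
    ratios = {}
    for (upper, n_upper), (lower, n_lower) in zip(levels, levels[1:]):
        ratios[f"{upper}:{lower}"] = n_lower / n_upper
    return ratios


hierarchy = [
    ("PSC", 9),              # Provincial Survey Coordinators
    ("DSC", 55),             # District Survey Coordinators
    ("FWC", 236),            # Fieldwork Coordinators
    ("FWS", 1182),           # Fieldwork Supervisors (one per team)
    ("Enumerator", 4728),    # four per team
]

ratios = span_of_control(hierarchy)
for pair, value in ratios.items():
    print(f"{pair} = 1:{value:.1f}")

# Average workload implied by the sampled EAs and the number of teams.
eas_per_team = 17_098 / 1_182
print(f"EAs per team on average: {eas_per_team:.1f}")
```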
Training of Fieldworkers
Training was planned and executed at national, provincial and district levels. The trainees at national level did the training at provincial level, and those that were trained at provincial level did the training at district level. The cascade method of training was at three levels. Training was initiated by subject matter specialists training the trainers at Head Office. The national trainers trained Provincial Survey Coordinators (PSCs), District Survey Coordinators (DSCs), Mapping Monitors (MMs) and GIS Officers at national level. The DSCs trained Fieldwork Coordinators (FWCs) at provincial level with the supervision and monitoring of the PSCs and Head Office monitors. Finally, FWCs trained Fieldwork Supervisors (FWSs) and Enumerators at district level.
During the training of fieldworkers, video training technology was used in addition to the instructor-led training approach. Although video training can never replace the trainer completely, it offered an ideal opportunity to reach large numbers of trainees, in different training venues, at different times, with customised training solutions, quickly and cost-effectively. After facilitation by the trainer during training sessions, a training video was used to consolidate knowledge learnt and to clarify issues that were not clear to trainees during training.
Every training session was evaluated by both the trainees and trainers. Trainers completed a daily evaluation form in order to identify problems that trainees had experienced during that particular day’s training. Areas that needed remedial training were revisited the following day.
Enumeration
The main objective of enumeration is to collect and document particulars of all individuals and housing units with the selected respondent(s).
The adopted enumeration method for CS 2007 was canvassing, whereby the enumerator conducts a face-to-face interview with the respondent while simultaneously completing the questionnaire. The Community Survey adopted both the de jure and de facto approaches in order to compare with other Stats SA social statistics definitions and to give a comparison over time between the censuses, with the ultimate objective of having two estimates of the population. The de jure population estimates are mostly useful for long-term planning, and the de facto population estimates are mostly used for demographic estimations.
Enumerators visited the selected sampled dwelling units to interview households and ensure that the information required from them was captured on the questionnaires. Self-enumeration was not allowed. Enumeration started as planned on 7 February 2007 and was carried out over a three-week period, followed by a non-response follow-up period of one week. The mop-up exercise was carried out from 1 to 7 March. This included follow-up on non-contacts, vacant dwellings, and unoccupied dwellings. However, due to the high number of dwelling units that were being mapped for the non-response follow-up period, the contracts of enumerators were extended beyond 28 February to assist the supervisors during that period.
Quality Assurance
The FWS and FWC conducted 100% quality checks for accuracy and completeness on all completed questionnaires. In addition, the DSCs, PSCs and Monitors did quality checks on randomly selected questionnaires and DUs and addressed problematic questions as they came up. The FWC also did 2% spot checks of selected dwelling units within their assigned fieldwork coordination unit to minimise bogus enumeration. Training played a big role in ensuring good quality data from the field. At district level, retraining was done in areas where fieldwork monitors felt that the work was not of the expected quality. A close watch was also kept on individual enumerators, and Fieldwork Supervisors and Fieldwork Coordinators who had problems performing according to expectations were retrained where necessary. Their work was also checked more frequently.
FWS were required to package questionnaires in their EA boxes and hand them over to the FWCs soon after the completion of the EA. The FWCs were required to sign for the receipt of the boxes after verifying their contents. They were in turn required to hand over the completed boxes to the DLOs for reverse logistics. DLOs were also required to sign for the receipt of the boxes after verifying their contents. The boxes were then stored in designated storage areas awaiting shipping back to the data processing centre in Pretoria.
Progress reporting for data collection was done on a daily basis. Provinces were provided with procedures and timelines for progress reporting and were able to report progress daily, although there were problems in the initial stages.
The Pilot Survey
A pilot survey was conducted in February 2006. The purpose of the pilot was to test all the developed strategies, methodologies, systems, and the questionnaire. A total of 782 EAs were covered in the pilot survey. During the pilot survey the effectiveness of instruments, processes and methods used within the scope of the CS were tested. A range of lessons were learnt which led to the refinement of processes, methods and systems towards the main survey.
Editing
The automated cleaning was implemented based on an editing rules specification defined with reference to the approved questionnaire. Most of the editing rules were categorised into structural edits, which look into the relationship between different record types; minimum processability rules, which removed false positive readings or noise; logical edits, which determine inconsistencies between fields of the same statistical unit; and inferential edits, which search for similarities across the domain. The edit specifications document for the structural, population, mortality and housing edits was developed by a team of Stats SA subject-matter specialists, demographers, and programmers. The process was successfully carried out during the months of July/August 2007.
Quality assurance was built into the questionnaire design, the listing of structures, and the fieldwork (through extensive training and supervision and regular quality checks in the field). Automated and manual editing were carried out as part of the post-capture process during July/August 2007 to ensure data consistency.
Cautionary note:
The Community Survey results were released on 24 October 2007. After the evaluation of the data by the Stats Council, the Community Survey was found to be comparable in many aspects with other Stats SA surveys, censuses and other external sources. However, there are some areas of concern where Statistics South Africa is urging users to be more cautious when using the Community Survey data.
The main concerns are:
·The institutional population is merely an approximation to 2001 numbers and it is not new data.
·The measure of unemployment in the Community Survey is higher and less reliable due to the differences in questions asked relative to the normal Labour Force Surveys.
·The income data include unreasonably high incomes for children, presumably due to misinterpretation of the question, e.g. listing a parent's income for the child.
·The distribution of households by province has very little congruence with the General Household Survey or Census 2001.
·The interpretation of grants or those receiving grants needs to be done with caution.
·Since the Community Survey is based on a random sample and not a census, any interpretation should be understood to have some random fluctuation in the data, particularly concerning the small population for some cells. The user should understand that the figures are within a certain interval of confidence.
Users should be aware of these statements as part of the cautionary notes:
·The household estimates at municipal level differ slightly from the national and provincial estimates in terms of the household variables profile;
·The Community Survey has considered as an add-on an approximation of population in areas not covered by the survey, such as institutions and recreational areas. This approximation of people could not provide the number of those households (i.e. institutions). Thus, there is no household record for those people approximated as living out of CS scope;
·Any cross-tabulation giving small numbers at municipal level should be interpreted with caution, since a small value in a given table cell is likely to be an over- or underestimate of the true population;
·No reliance should be placed on numbers for variables broken down at municipal level (i.e. age, population group etc.). However, the aggregated total number per municipality provides more reliable estimates;
·Usually a zero total figure (excluding those in institutions) reflects the fact that no sample was realised and in such cases this is likely to be a significant underestimate of the true population.
·As an extension from the above statement, in a number of instances the number realised in the sample, though not zero, was very small (maybe as low as a single individual) and in some cases had to be re-weighted by a very large factor (maximum nearly 800 for housing weight and over 1000 for person weight).
·As a further consequence, small sub-populations are likely to be heavily over- or under-represented at a household level in the data.
·It should be noted that the estimates were done with the use of the de facto population and not the de jure population. The final results are presented on the de jure population.
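The cautions above about small cells and very large weights can be turned into a simple screening step before a cross-tabulation is published. The sketch below is one possible approach, not a Stats SA procedure: the function name, the thresholds (`min_n`, `max_weight`) and the example records are all hypothetical and are not taken from the CS 2007 files.

```python
from collections import defaultdict


def screen_cells(records, min_n: int = 30, max_weight: float = 500.0) -> dict:
    """Weighted totals per cell, flagged when based on few or heavily weighted records.

    `records` is an iterable of (cell_key, weight) pairs.
    """
    cells = defaultdict(lambda: {"n": 0, "estimate": 0.0, "largest_weight": 0.0})
    for key, weight in records:
        cell = cells[key]
        cell["n"] += 1                         # unweighted sample count
        cell["estimate"] += weight             # weighted population estimate
        cell["largest_weight"] = max(cell["largest_weight"], weight)
    for cell in cells.values():
        # Flag cells that rest on too few records or on one very large weight.
        cell["flag"] = cell["n"] < min_n or cell["largest_weight"] > max_weight
    return dict(cells)


# Made-up records: cell "A" has 40 ordinary records, cell "B" a single record
# carrying a weight of 1000.
example = [("A", 50.0)] * 40 + [("B", 1000.0)]
result = screen_cells(example)
for key, cell in result.items():
    print(key, cell["n"], cell["estimate"], cell["flag"])
```

In this made-up case the estimate for cell "B" is 1 000 people but rests on one respondent, which is the situation the cautionary notes describe.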
Name | URL |
---|---|
DataFirst | http://www.datafirst.uct.ac.za |
The data files from the Community Survey 2007 are public use files, accessible to all. Users may apply or process this data, provided Statistics South Africa (Stats SA) is acknowledged as the original source of the data; that it is specified that the application and/or analysis is the result of the user's independent processing of the data; and that neither the basic data nor any reprocessed version or application thereof may be sold or offered for sale in any form whatsoever without prior permission from Stats SA.
Where a copy of the information is made available to any third party outside the State, the third party must be made aware of the existence of State copyright and ownership of the information by the State. The State (through Statistics South Africa) retains the full ownership of its information, products and services at all times. Access to information does not give ownership of the information to the client. The use of any data is subject to acknowledgement of Statistics South Africa as the supplier and owner of copyright.
Publications based on datasets distributed by DataFirst should acknowledge relevant sources by means of bibliographic citations. To ensure that such source attributions are captured for social science bibliographic utilities, citations must appear in footnotes or in the reference section of publications. The bibliographic citation for this dataset is:
Community Survey 2007. [microdata file]. Pretoria: Statistics South Africa [producer], 2008. Cape Town: DataFirst [distributor], 2010.
The information products and services of Stats SA are protected in terms of the Copyright Act, 1978 (Act 98 of 1978). As the State President is the holder of State copyright, all organs of State enjoy unhindered use of the Department's information products and services, without a need for further permission to copy in terms of that copyright.
Statistics South Africa (Stats SA) will not be liable for any damages or losses, except to the extent that such losses or damages are attributable to a breach by Stats SA of its obligations in terms of an existing agreement or to the negligence or wilful act or omissions of Stats SA, its servants or agents, arising out of the supply of data and/or digital products in terms of that agreement. The user indemnifies Stats SA against any claims of whatsoever nature (including legal costs) by third parties arising from the reformatting, restructuring, reprocessing and/or addition of the data by the user.
There have been demographic changes in South Africa associated, inter alia, with internal and external migration and population growth. This means that population profiles may have changed at differing geographic levels. Stats SA is not responsible for any damages or losses, arising directly or consequently, which might result from the application or use of these data.
Copyright 2008, Statistics South Africa.
Name | Affiliation | Email | URL |
---|---|---|---|
Manager, DataFirst | University of Cape Town | info@data1st.org | http://www.datafirst.uct.ac.za |
DDI_ZAF_2007_PHC_v01_M
Name | Affiliation | Role |
---|---|---|
DataFirst | University of Cape Town | Documentation of Study |
2008
Version 01: Adopted from "ddi-zaf-datafirst-cs-2007-v1.1" DDI that was done by metadata producer mentioned in "Metadata Production" section.