The Cape Area Panel Study (CAPS) is a longitudinal study of the lives of youths and young adults in metropolitan Cape Town, South Africa. The first wave of the study collected interviews from about 4800 randomly selected young people aged 14-22 in August-December 2002. Wave 1 also collected information on all members of these young people’s households, as well as on a random sample of households that did not have members aged 14-22. A third of the youth sample was re-interviewed in 2003 (Wave 2a) and the remaining two thirds were re-visited in 2004 (Wave 2b). The full youth sample was then re-interviewed in 2005 (Wave 3), 2006 (Wave 4) and 2009 (Wave 5). Wave 3 includes interviews with approximately 2000 co-resident parents of young adults, while Wave 4 also includes interviews with a sample of older adults (all individuals from the original 2002 households who were born on or before 1 January 1956) and all children born to the female young adults. The fifth wave comprises all respondents interviewed in any of the Waves 2a, 3 or 4. In 2010, telephonic follow-ups and proxy interviews were used to try to reach respondents who had not been successfully interviewed during the 2009 fieldwork. The study covers a wide range of outcomes, including schooling, employment, health, family formation, and intergenerational support systems.
CAPS began in 2002 as a collaborative project of the Population Studies Center in the Institute for Social Research at the University of Michigan and the Centre for Social Science Research at the University of Cape Town (UCT). Other units involved in subsequent waves include UCT’s Southern African Labour and Development Research Unit and the Research Program in Development Studies at Princeton University. Primary funding is provided by the National Institute of Child Health and Human Development of the U.S. National Institutes of Health (NIH). Additional funding has been provided by the Office of AIDS Research, the Fogarty International Center, and the National Institute on Aging of NIH, and by grants from the Andrew W. Mellon Foundation to the University of Michigan and the University of Cape Town.
Kind of data
Sample survey data [ssd]
Version 1 (October 2012): Edited, anonymised dataset for public distribution
LABOUR AND EMPLOYMENT 
HOUSING AND LAND USE PLANNING 
TRANSPORT, TRAVEL AND MOBILITY 
DEMOGRAPHY AND POPULATION 
SOCIAL WELFARE POLICY AND SYSTEMS 
Metropolitan Cape Town
The data are available at magisterial district level.
Unit of analysis
The units of analysis for this survey are households and individuals.
The survey covered youths and young adults in Metropolitan Cape Town, South Africa.
Producers and sponsors
University of Michigan and University of Cape Town
National Institute of Child Health and Human Development (United States National Institutes of Health)
Andrew W. Mellon Foundation
Fogarty International Center
National Institute on Aging of NIH
The Office of AIDS Research
The CAPS household sample was drawn through a two-stage process. First, the 'enumeration areas' (EAs) used for the 1996 Population Census were divided into three strata according to whether the population of each was predominantly African, predominantly coloured or predominantly white. A sample of primary sampling units (PSUs) was selected within each stratum with probability proportional to size. Within each PSU a sample of 25 screener households was drawn. The Overview and Technical Documentation for Waves 1-2-3-4-5 provides a more detailed discussion of the sampling design. Data users should take the stratification and clustering into account for all analyses. Strata and PSUs are identified by the majpop and cluster variables respectively.
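As a rough illustration of what taking the stratification and clustering into account can mean in practice, the sketch below computes a weighted mean and a design-based standard error using Taylor linearization with PSUs treated as sampled with replacement within strata. This is one common approximation, not a method prescribed by the CAPS documentation. The function name and the outcome key `y` are hypothetical; `majpop`, `cluster` and `weightyr` are the variable names given in this description.

```python
from collections import defaultdict
from math import sqrt


def design_based_mean(rows, value, weight="weightyr",
                      stratum="majpop", psu="cluster"):
    """Weighted mean and linearized standard error for a stratified,
    clustered sample. `rows` is a list of dicts, one per respondent.
    Assumes PSUs are drawn with replacement within each stratum."""
    total_w = sum(r[weight] for r in rows)
    mean = sum(r[weight] * r[value] for r in rows) / total_w

    # Sum the linearized scores of the ratio estimator within each PSU,
    # keeping PSUs nested inside their stratum.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for r in rows:
        score = r[weight] * (r[value] - mean) / total_w
        psu_totals[r[stratum]][r[psu]] += score

    # Between-PSU variance within each stratum, summed over strata.
    variance = 0.0
    for totals in psu_totals.values():
        n = len(totals)
        if n < 2:
            raise ValueError("each stratum needs at least two PSUs")
        centre = sum(totals.values()) / n
        variance += n / (n - 1) * sum((t - centre) ** 2
                                      for t in totals.values())
    return mean, sqrt(variance)
```

Ignoring the strata and PSUs and treating the observations as a simple random sample would typically understate the standard error, because respondents in the same PSU tend to resemble one another.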
The public release data include sample weights that should be used to adjust for the sample design. Three sample weights for Wave 1 are included in the data, each of which deals with a specific issue.
The first of these weights, weightsd, adjusts for three critical elements of the sample design: 1) the intentional oversampling of African and white households; 2) the intentional differential sampling of households with and without young adult household members; and 3) the addition of secondary households (backyard shacks) to the sample of screener households in the field. This weight is incorporated into the other two sample weights. The second, weighthr, starts from the first weight and adds an adjustment for unit non-response at the level of PSUs. The third sample weight, weightyr, is an individual young adult weight that adds a further adjustment for individual non-response. This adjustment is made by calculating response rates for each combination of single year of age, sex, and population group (8x2x3=48 cells) using the information provided on the household questionnaire.
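The cell-based adjustment described for weightyr can be sketched as follows. This is a minimal illustration under stated assumptions, not the CAPS production code: the response rate in each cell is taken as an unweighted count ratio, the base weight to be adjusted is passed in by the caller, and the dictionary keys (`id`, `age`, `sex`, `popgroup`, `responded`, `base_weight`) and the function name are hypothetical.

```python
from collections import Counter


def cell_nonresponse_weights(people, base="base_weight"):
    """Divide each respondent's base weight by the response rate of
    their (age, sex, population group) cell.

    `people` lists every eligible young adult from the household
    roster; `responded` is True if the person completed an interview.
    Returns {id: adjusted weight} for respondents only."""
    eligible = Counter()
    responded = Counter()
    for p in people:
        cell = (p["age"], p["sex"], p["popgroup"])
        eligible[cell] += 1
        if p["responded"]:
            responded[cell] += 1

    adjusted = {}
    for p in people:
        if not p["responded"]:
            continue  # non-respondents receive no weight
        cell = (p["age"], p["sex"], p["popgroup"])
        rate = responded[cell] / eligible[cell]  # unweighted (assumption)
        adjusted[p["id"]] = p[base] / rate
    return adjusted
```

With a 50% response rate in a cell, each respondent in that cell ends up representing twice as many people as the base weight alone implies.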
In addition to the three sample design weights, the Waves 1-2-3-4-5 public release data sets include weights that adjust for individual young adult non-response in Waves 2, 3, 4 and 5. Since Wave 2 is composed of two sub-waves (Waves 2a and 2b) with different modules asked of different sub-samples, there are three Wave 2 attrition weights. The weight w2a_weightyr corresponds to the Wave 2a sub-sample (approximately one-third of the total CAPS young adult sample), the weight w2b_weightyr corresponds to the Wave 2b sub-sample (approximately two-thirds of the total CAPS young adult sample), and the weight w2y_weightyr corresponds to the combined "total" Wave 2 sample. All of these weights are individual young adult weights: each takes the weight weightyr, which adjusts for the sample design and Wave 1 non-response, and adds an adjustment for individual young adult non-response in Wave 2a, 2b or 2 "total".
Similarly, the weights w3y_weightyr, w4y_weightyr and w5y_weightyr are individual young adult weights that add a further adjustment for individual young adult non-response in Waves 3, 4 and 5 to the weight weightyr. The adjustment for non-response in Wave 2a, Wave 2b, Wave 2 "total", Wave 3, Wave 4 or Wave 5 is made by estimating a separate probit model, for each of these, of the probability that the respondent completed the corresponding young adult questionnaire (for Wave 2 "total", either of the Wave 2 questionnaires). Information given in Wave 1 on age, sex and population group was included in the model. As in the construction of the original weight weightyr, the small number of individuals classified as Indian and other were merged with the Coloured group. The predicted probability from each model was inverted and then capped at the 99th percentile to obtain the non-response adjustment.
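The final step of the attrition adjustment (invert the predicted response probability, cap at the 99th percentile, and apply to weightyr) can be sketched as below. The probit model itself is not shown; its predicted probabilities are taken as given inputs. The function name is hypothetical, and the percentile definition (linear interpolation via `statistics.quantiles` with `method="inclusive"`) is an assumption, since the description does not say how the percentile was computed.

```python
from statistics import quantiles


def attrition_weights(base_weights, response_probs, cap_percentile=99):
    """Multiply each base weight (e.g. weightyr) by the inverse of the
    predicted response probability, with the inverse capped at the
    given percentile (an integer from 1 to 99) of its distribution
    among the respondents supplied. Needs at least two respondents."""
    if len(base_weights) != len(response_probs):
        raise ValueError("inputs must have the same length")
    if any(not 0.0 < p <= 1.0 for p in response_probs):
        raise ValueError("probabilities must lie in (0, 1]")

    factors = [1.0 / p for p in response_probs]
    # quantiles(..., n=100) returns the 99 percentile cut points;
    # index k - 1 holds the k-th percentile.
    cap = quantiles(factors, n=100, method="inclusive")[cap_percentile - 1]
    return [w * min(f, cap) for w, f in zip(base_weights, factors)]
```

Capping limits the influence of the few respondents whose predicted probability of responding was very low and whose inverse-probability factor would otherwise be extreme.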
Dates of collection
Mode of data collection
• Wave 1 (2002) included a household questionnaire, a young adult questionnaire and a literacy and numeracy evaluation questionnaire
• Wave 2a (2003-2004) and 2b (2004) both included young adult questionnaires only
• Wave 3 (2005) included a household questionnaire, a parent questionnaire and a young adult questionnaire
• Wave 4 (2006) included a household questionnaire, an older adult questionnaire, a young adult questionnaire, a young adult proxy questionnaire and a child questionnaire
• Wave 5 (2009) included a young adult questionnaire, young adult telephonic questionnaire and a young adult proxy questionnaire
University of Cape Town
Public use files, accessible to all.
Papers using the CAPS Waves 1-2-3-4-5 data should include the following acknowledgement:
The Cape Area Panel Study Waves 1-2-3 were collected between 2002 and 2005 by the University of Cape Town and the University of Michigan, with funding provided by the US National Institute for Child Health and Human Development and the Andrew W. Mellon Foundation. Wave 4 was collected in 2006 by the University of Cape Town, University of Michigan and Princeton University. Major funding for Wave 4 was provided by the National Institute on Aging through a grant to Princeton University, in addition to funding provided by NICHD through the University of Michigan. Wave 5 was collected in 2009 by the University of Cape Town. Major funding for Wave 5 was provided by the Health Economics & HIV/AIDS Research Division (HEARD) at the University of KwaZulu-Natal, with additional funding from the Andrew W. Mellon Foundation (through the CSSR at UCT), the European Union (through the Microcon research partnership on the microfoundations of violent conflict, via the CSSR) and the NICHD (through the University of Michigan).
By registering to use the CAPS data, you agree to the following conditions:
• You will not attempt to identify specific individuals in the CAPS data.
• You will not redistribute the data to other users; all users should register on the CAPS web site.
• You will include the recommended citation (as specified below) in any paper, thesis, dissertation, or other publication that uses the CAPS data. In the text, it should be cited as: The CAPS (Cape Area Panel Study) is produced and distributed by the universities of Michigan and Cape Town, with funding from the National Institutes of Health and the Andrew W. Mellon Foundation.
• You will provide CAPS with an electronic copy of any paper, thesis, dissertation, or other publication that uses CAPS data. These should be sent to Lynn Woolfrey, DataFirst, University of Cape Town, Private Bag, Rondebosch, Cape Town 7701, South Africa (email firstname.lastname@example.org), or to one of the CAPS directors.
• You will notify CAPS staff regarding errors in the data or any features of the data that could compromise respondent confidentiality.
University of Michigan and University of Cape Town. Cape Area Panel Study 2002-2009, Waves 1-5. 2012 [dataset]. Version 1210. Cape Town and Ann Arbor: University of Cape Town and University of Michigan [producers], 2012. Cape Town: DataFirst [distributor], 2012.
University of Cape Town
World Bank Microdata Library
The World Bank
University of Cape Town
University of Cape Town
Version 2 (May 2014)
Adaptation of Version 1 (October 2012) zaf-datafirst-caps1-2-3-4-5-v1