The STEP project consists of a Household Surveys collection and an Employer Surveys collection.
These surveys are part of the STEP Household Surveys collection.
So far, two waves have been implemented in 12 countries. The third wave is under preparation.
The first wave started in September 2011 and was completed in December 2013. Wave 1 countries are: Bolivia, Colombia, Sri Lanka, Lao PDR, Vietnam, the Yunnan Province in China, Ghana, and Ukraine.
The second wave started in August 2012 and was completed in June 2014. Wave 2 countries are: Armenia, Georgia, Macedonia, and Kenya.
The STEP (Skills Toward Employment and Productivity) Measurement program is the first ever initiative to generate internationally comparable data on skills available in developing countries. The program implements standardized surveys to gather information on the supply and distribution of skills and the demand for skills in the labor markets of low-income countries.
The uniquely designed Household Survey includes modules that measure the cognitive skills (reading, writing and numeracy), socio-emotional skills (personality, behavior and preferences) and job-specific skills (subset of transversal skills with direct job relevance) of a representative sample of adults aged 15 to 64 living in urban areas, whether they work or not. The cognitive skills module also incorporates a direct assessment of reading literacy based on the Survey of Adult Skills instruments. Modules also gather information about family, health and language.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
The units of analysis are the individual respondents and households. A household roster is undertaken at the start of the survey and the individual respondent is randomly selected among all household members aged 15 to 64 (inclusive). The random selection process was designed by the STEP team and compliance with the procedure is carefully monitored during fieldwork.
Version 02, edited anonymous datasets for public distribution.
Version 01 was published in June 2014, but is now replaced with v02.
The difference between v02 and v01 datasets:
- Changes made to all STEP countries:
1) The literacy variables had incorrect labelling, which has now been fixed
2) The 'emp' variable has been cleaned
3) The 'write_dif' variable has been corrected
4) All monetary variables (identifiable by '_usd') have been converted to PPP dollars
The scope of the study includes:
- household demographic characteristics
- dwelling characteristics
- education and training
- job skill requirements
- personality, behavior and preferences
- language and family background
- reading literacy test assessment
The survey covered the following regions: Western, Central, Greater Accra, Volta, Eastern, Ashanti, Brong Ahafo, Northern, Upper East and Upper West.
- Areas are classified as urban based on each country's official definition.
The target population for the Ghana STEP survey comprises all non-institutionalized persons 15 to 64 years of age (inclusive) living in private dwellings in urban areas of the country at the time of data collection. This includes all residents except foreign diplomats and non-nationals working for international organizations.
Exclusions: Military barracks were excluded from the Ghana target population.
Producers and sponsors
STEP Co-Task Team Leader, Education Global Practice
Maria Laura Sanchez Puerta
STEP Co-Task Team Leader, Social Protection and Labor Global Practice
World Bank Consultant Project Coordinator
Technical assistance in project management, data collection, data processing and data analysis
World Bank Consultant Senior Labor Economist
Technical assistance in project management, questionnaire design, and data analysis
World Bank Consultant Survey Consultant
Technical assistance in questionnaire design, sampling methodology, and data collection
Sebastian Monroy Taborda
World Bank Consultant Research Analyst
Technical assistance in data processing and data analysis
Multi-Donor Trust Fund Labor Markets, Job Creation and Economic Growth
Bank Netherlands Partnership Program
Educational Testing Services
Designed the Reading Literacy Assessment Module and conducted the preliminary analysis of the reading literacy data, including generating plausible values for the Extended Assessment
The Ghana sample design is a four-stage sample design. There was no explicit stratification but the sample was implicitly stratified by Region. [Note: Implicit stratification was achieved by sorting the PSUs (i.e., EACode) by RegnCode and selecting a systematic sample of PSUs.]
First Stage Sample
The primary sample unit (PSU) was a Census Enumeration Area (EA). Each PSU was uniquely defined by the sample frame variables RegnCode and EACode. The sample frame was sorted by RegnCode to implicitly stratify the sample frame PSUs by region. The sampling objective was to select 250 PSUs, comprising 200 Initial PSUs and 50 Reserve PSUs. Although 250 PSUs were selected, only 201 PSUs were activated. The PSUs were selected using a systematic probability proportional to size (PPS) sampling method, where the measure of size was the population size (i.e., EAPopn) in a PSU.
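The first-stage selection can be illustrated with a minimal sketch of systematic PPS sampling over a frame sorted by region. This is an illustration only, not the program actually run for the Ghana survey: the toy frame and the function name `systematic_pps` are hypothetical, and only the variable names RegnCode, EACode and EAPopn come from the description above.

```python
import random
from bisect import bisect_left
from itertools import accumulate


def systematic_pps(frame, n, rng):
    """Draw n PSUs with probability proportional to EAPopn.

    frame is a list of dicts with keys RegnCode, EACode and EAPopn.
    """
    # Sorting by region before sampling gives the implicit stratification.
    ordered = sorted(frame, key=lambda ea: (ea["RegnCode"], ea["EACode"]))
    cumulative = list(accumulate(ea["EAPopn"] for ea in ordered))
    interval = cumulative[-1] / n      # sampling interval on the size scale
    start = rng.random() * interval    # random start in [0, interval)
    selected = []
    for k in range(n):
        point = start + k * interval
        # First EA whose cumulative size reaches the selection point.
        index = min(bisect_left(cumulative, point), len(ordered) - 1)
        selected.append(ordered[index])
    return selected


# Hypothetical toy frame: 40 EAs spread over 4 regions.
toy_frame = [
    {"RegnCode": 1 + i % 4, "EACode": 1000 + i, "EAPopn": 100 + i}
    for i in range(40)
]
sample = systematic_pps(toy_frame, n=5, rng=random.Random(2011))
```

In this sketch an EA whose size exceeds the sampling interval could be hit more than once; a production design would normally treat such units separately.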
Second Stage Sample
The second stage sample unit is a PSU partition. It was considered necessary to partition 'large' PSUs into smaller areas to facilitate the listing process. After the partitioning of the PSUs, the survey firm randomly selected one partition. The selected partition was fully listed for subsequent enumeration in accordance with the field procedures.
Third Stage Sample
The third stage sample unit (SSU) is a household. The sampling objective was to obtain interviews at 15 households within each selected PSU. The households were selected in each PSU using a systematic random method.
Fourth Stage Sample
The fourth stage sample unit was an individual aged 15-64 (inclusive). The sampling objective was to select one individual with equal probability from each selected household.
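The third and fourth stages can be sketched in a few lines, under stated assumptions. The household listing, the roster layout and the function names are hypothetical; the sketch only illustrates a systematic random draw from a listing and an equal-probability draw of one eligible household member, and is not the field procedure itself.

```python
import random


def systematic_households(listing, n, rng):
    """Pick n households from an ordered listing at a fixed interval.

    Assumes n is not larger than the number of listed households.
    """
    interval = len(listing) / n
    start = rng.random() * interval    # random start in [0, interval)
    picks = []
    for k in range(n):
        index = min(int(start + k * interval), len(listing) - 1)
        picks.append(listing[index])
    return picks


def select_respondent(roster, rng):
    """Pick one household member aged 15 to 64 with equal probability."""
    eligible = [member for member in roster if 15 <= member["age"] <= 64]
    return rng.choice(eligible) if eligible else None


rng = random.Random(42)
listing = [f"HH{i:03d}" for i in range(1, 121)]   # hypothetical listing of 120 households
households = systematic_households(listing, 15, rng)

roster = [
    {"name": "A", "age": 12},
    {"name": "B", "age": 37},
    {"name": "C", "age": 64},
    {"name": "D", "age": 70},
]
respondent = select_respondent(roster, rng)   # only B and C are eligible
```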
The Ghana firm's sampling objective was to obtain interviews from 3000 individuals in the urban areas of the country. To provide a sample large enough to allow for a worst-case scenario of a 50% response rate, the number of sampled cases was doubled in each selected PSU.
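The arithmetic behind these figures can be checked with a short calculation. It assumes that the doubling applies to the 15 target households per PSU stated for the third stage, and that the 3000 target interviews correspond to the 200 initial PSUs; the variable names are illustrative.

```python
import math

initial_psus = 200
interviews_per_psu = 15
worst_case_response_rate = 0.5

target_interviews = initial_psus * interviews_per_psu    # 3000
# Sampled cases needed per PSU to still reach the target at a 50% response rate.
sampled_per_psu = math.ceil(interviews_per_psu / worst_case_response_rate)   # 30
total_sampled = initial_psus * sampled_per_psu            # 6000
```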
Although 50 extra PSUs were selected for use in case it was impossible to conduct any interviews in one or more initially selected PSUs, only one reserve PSU was activated. Therefore, the Ghana firm conducted the STEP data collection in a total of 201 PSUs.
Sampling methodologies are described for each country in two documents:
(i) The National Survey Design Planning Report (NSDPR)
(ii) The weighting documentation
An overall response rate of 83.2% was achieved in the Ghana STEP Survey. Table 20 of the weighting documentation provides the detailed percentage distribution by final status code.
While the Ghana four-stage stratified cluster design greatly enhanced the operational feasibility of data collection, it resulted in differential probabilities of selection for the selected persons. Consequently, each selected person in the survey does not necessarily represent the same number of persons in the target population. To account for differential probabilities of selection due to the nature of the design and to ensure accurate survey estimates, STEP requires a sampling weight for each person that participated in the survey.
In general, the objectives of the STEP weighting are to construct a set of survey weights to:
1) Compensate for unequal probabilities of selection;
2) Compensate for household-level non-response and person-level non-response;
3) Adjust the weighted sample distribution for key variables of interest (for example, age, gender, education) so that it conforms to a known population distribution for these variables.
The general weighting procedure for the Ghana STEP survey required the following tasks.
1) Creation of a data file to input into the weighting process
2) World Bank (WB) Weight Requirement: Create survey weights for sampled cases of households and persons that provided sufficient data to be considered a participant in the survey. This requirement does not necessarily include the completion of an assessment General Booklet, nor does it necessarily include the completion of all household and individual questionnaire modules.
a) Calculation of a PSU weight for 201 activated PSUs;
b) Calculation of a household weight for each sampled household;
i) Calculation of a household-level non-response adjustment independently for each PSU.
c) Calculation of a person weight for each selected person (SP);
i) Calculation of a non-response adjustment independently for each sampled person.
3) The required output from the weighting process is a final Ghana data file with the survey design weights (i.e., for each sampled PSU, household, person) appended to each data record.
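The way the stage weights combine can be sketched as a product of inverse selection probabilities. The formulas below are the textbook form of design weights for a PPS, partition, household and person design, not the formulas from the STEP weighting documentation; the function name, its parameters and the example figures are hypothetical. The person-level non-response adjustment and any calibration to population totals are left out.

```python
def person_design_weight(n_psus, psu_popn, frame_popn, n_partitions,
                         hh_listed, hh_sampled, hh_responding, n_eligible):
    """Combine stage weights into one weight for a selected person."""
    # Stage 1: inverse of the PPS inclusion probability n * size / total size.
    psu_weight = frame_popn / (n_psus * psu_popn)
    # Stage 2: one partition drawn at random out of n_partitions.
    partition_weight = n_partitions
    # Stage 3: households sampled from the listed households in the partition.
    household_weight = hh_listed / hh_sampled
    # Household-level non-response adjustment, computed within the PSU.
    nonresponse_adjustment = hh_sampled / hh_responding
    # Stage 4: one person drawn with equal probability among eligible members.
    person_weight = n_eligible
    return (psu_weight * partition_weight * household_weight
            * nonresponse_adjustment * person_weight)


# Hypothetical figures for one respondent.
weight = person_design_weight(
    n_psus=200, psu_popn=1000, frame_popn=10_000_000, n_partitions=2,
    hh_listed=120, hh_sampled=30, hh_responding=24, n_eligible=3,
)
# 50 * 2 * 4 * 1.25 * 3 = 1500
```

A respondent with this weight would stand for about 1500 persons in the target population, which is why unweighted estimates from a design like this can be misleading.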
Dates of Data Collection
Data Collection Mode
Team Supervisors - Each interviewer team will report to a Team Supervisor
Team Supervisors' responsibilities include:
- Coordinating fieldwork in each assigned PSU
- Full-time work with the interviewer team and on-going monitoring of each interviewer's work
- Documenting non-response, activation of reserves, problems encountered
- Assigning literacy booklets
- Communicating regularly with the Field Manager
- Selecting households to be interviewed following procedures outlined in the Technical Standards (if selection will be done in Headquarters, please specify)
Quality control by Team Supervisors:
- At least one meeting per week with each interviewer to discuss progress and/or problems
- Random spot visits during interviewers' work to observe household and individual interviews. For each interview observed, Team Supervisors will fill out the Interview Evaluation Form (Appendix 5)
- Check each accepted questionnaire for completeness and accuracy, and fill out Visual Scrutiny Form for each questionnaire (Appendix 7)
- Submit household listings and sample selections to the Project Manager
- Follow-up of non-response households/ individuals according to the table in Appendix 6 which details the revisits required for each situation and whether a reserve household should be activated
Visit verification and selection of individual respondent verification:
- The Supervisor or Field Manager (or assistants) will revisit 15% of each interviewer's finalized cases.
- In the event that a respondent is not available during the initial follow-up visit, a telephone follow-up may be carried out for no more than one third of the revisits.
- The households to revisit will be selected randomly by the Field Manager.
- During each revisit, the Supervisor will complete a Check up Visit form (Appendix 8).
- The Fieldwork Manager should participate with the Team Supervisor in some of these revisits, unannounced, with households chosen by the Field Supervisor, in order to check on the Team Supervisors.
- The STEP Consortium may also ask to attend verification revisits, and randomly choose the Households to revisit.
Field Supervision details are laid out in point #5 of the Fieldwork section 2.6 (p28) of the NSDPR provided as an external resource.
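The random choice of revisits can be sketched as follows. The case identifiers and function names are hypothetical, and rounding the 15% share up to a whole case is an assumption, since the text does not say how fractions are handled.

```python
import random


def select_revisits(finalized_cases, rng, percent=15):
    """Randomly choose the cases to revisit for one interviewer."""
    # Integer ceiling of percent% of the cases (rounding up is an assumption).
    n = (len(finalized_cases) * percent + 99) // 100
    return rng.sample(finalized_cases, n)


def max_telephone_followups(n_revisits):
    """Upper bound on revisits done by telephone (no more than one third)."""
    return n_revisits // 3


cases = [f"case-{i:02d}" for i in range(1, 41)]   # hypothetical: 40 finalized cases
revisits = select_revisits(cases, random.Random(7))
phone_cap = max_telephone_followups(len(revisits))
```

With 40 finalized cases this selects 6 revisits, of which at most 2 may be handled by telephone.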
Data Collection Notes
1. Each component of the STEP Survey was carried out by a personal visit using a Paper And Pencil Interview (PAPI) method.
2. As the STEP program requires all surveys to be implemented in a standardized way, particular attention was paid to implementation processes:
(i) Each participating country (survey firm) wrote up a National Survey Design Planning Report (NSDPR) detailing how it intended to implement the STEP survey while complying with the STEP Technical Standards. The NSDPRs were submitted to the WB STEP team for approval.
(ii) The WB STEP team and Educational Testing Services (ETS) provided 2 workshops to all survey firms. The first was a 2-day workshop provided via video conference and aimed at presenting the STEP Technical Standards. The second workshop was organized over 2 full weeks at the WB's Headquarters and consisted of a training course for project managers from each survey firm on the survey instruments - Background Questionnaire and Reading Literacy Assessment - as well as on implementation and data management procedures.
(iii) Based on the STEP Technical Standards, the survey firms adapted and translated the STEP survey instruments, the Interviewer Manual, and all training materials.
(iv) Once the instruments had been adapted and translated, survey firms carried out a pre-test, usually including 20-30 interviews. Findings from the pre-test were discussed with the WB STEP team and ETS to finalize the adaptation and translation of the STEP survey instruments. The survey was implemented in English.
(v) Each survey firm provided a 2-week training course to its enumerators, using training materials developed by the WB STEP team (after translation and adaptation). The WB STEP team's Survey Consultant helped organize the training and was present in the country for the first few days at least of the training. In addition, the WB STEP team in Washington DC provided just-in-time technical assistance, answering questions sent by the survey firm during the training. The training included in-field mock interviews in addition to in-class courses. At the end of the training, survey firms only retained enumerators having demonstrated a good understanding of the instruments.
(vi) As per STEP Technical Standards, data collection started within a few days of the end of the enumerators' training course. The composition of each country's fieldwork teams is described in the NSDPR, as well as reporting procedures and quality control processes. Weekly reports were sent to the WB STEP team, which provided just-in-time technical assistance during fieldwork to answer questions or concerns. Regular calls or VCs were also held between survey firms and the WB STEP team to discuss progress. Matters discussed usually involved questions on how to deal with specific situations, strategies to reduce non-response, the activation of reserve households, and general pace of progress. Non-response rates were high in Bolivia and Colombia, in part due to difficult access to apartment buildings and gated communities, although survey firms worked hard to gain local community leaders' support. In a few instances - all documented in the weighting documentation - a couple of EAs were replaced due to security concerns or because an EA had been completely altered (e.g. construction site, dwellings converted into a large shopping center).
(vii) Interviews lasted between 120 and 150 minutes, depending on respondents' reading proficiency.
Detailed information is provided in the National Survey Design Planning Report (NSDPR). It describes the project management structure, fieldwork teams and reporting processes.
Ghana Institute of Statistical, Social and Economic Research
The STEP survey instruments include:
(i) a Background Questionnaire developed by the WB STEP team
(ii) a Reading Literacy Assessment developed by Educational Testing Services (ETS).
All countries adapted and translated both instruments following the STEP Technical Standards: 2 independent translators adapted and translated the Background Questionnaire and Reading Literacy Assessment, while reconciliation was carried out by a third translator.
The WB STEP team and ETS collaborated closely with the survey firms during the process and reviewed the adaptation and translation (using a back translation). In the case of Ghana, no translation was necessary, but the adaptation process ensured that the English used in the Background Questionnaire and Reading Literacy Assessment closely reflected local use.
- The survey instruments were both piloted as part of the survey pretest.
- The adapted Background Questionnaires are provided in English as external resources. The Reading Literacy Assessment is protected by copyright and will not be published.
STEP Data Management Process
1. Raw data is sent by the survey firm
2. The WB STEP team runs data checks on the Background Questionnaire data.
- ETS runs data checks on the Reading Literacy Assessment data.
- Comments and questions are sent back to the survey firm.
3. The survey firm reviews comments and questions. When a data entry error is identified, the survey firm corrects the data.
4. The WB STEP team and ETS check that the data files are clean. This might require additional iterations with the survey firm.
5. Once the data has been checked and cleaned, the WB STEP team computes the weights. Weights are computed by the STEP team to ensure consistency across sampling methodologies.
6. ETS scales the Reading Literacy Assessment data.
7. The WB STEP team merges the Background Questionnaire data with the Reading Literacy Assessment data and computes derived variables.
Detailed information on data processing in STEP surveys is provided in the 'Guidelines for STEP Data Entry Programs' document provided as an external resource. The template do-file used by the STEP team to check the raw background questionnaire data is provided as an external resource.
Data entry processes, including team composition, are described in the NSDPR. In most countries, data entry took place at the survey firm's headquarters.
1. Background Questionnaire Data
For the Background Questionnaire data, survey firms could use the WB STEP Data Entry Program (DEP) or design their own. In the latter case, the WB STEP team checked their DEP to ensure it complied with STEP Technical Standards. The STEP DEP was developed in Excel and mirrored the Background Questionnaire. Ghana used the STEP DEP.
(i) Countries which used the STEP DEP
- Yunnan Province of China
(ii) Countries which developed their own DEP in CSPro
- Lao PDR
- Sri Lanka
Standards for Data Entry are detailed in the 'Guidelines for STEP Data Entry Programs' and summarized in the NSDPR. A double data entry process was required. All range checks and skips were controlled by the program. Consistency checks were also included in the data entry program.
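The idea behind double data entry and range checks can be sketched briefly. This is not the STEP DEP, which was built in Excel; the record layout, the identifiers and the function names are hypothetical, and the 15 to 64 age range is used only because it is the eligibility range stated for respondents.

```python
def compare_entries(first, second):
    """List every (record, variable) where two independent entries disagree.

    first and second map a record id to a dict of variable -> value.
    """
    mismatches = []
    for rec_id in sorted(set(first) | set(second)):
        a = first.get(rec_id, {})
        b = second.get(rec_id, {})
        for var in sorted(set(a) | set(b)):
            if a.get(var) != b.get(var):
                mismatches.append((rec_id, var, a.get(var), b.get(var)))
    return mismatches


def out_of_range(records, var, low, high):
    """Return ids of records whose value for var falls outside [low, high]."""
    return [rec_id for rec_id, rec in sorted(records.items())
            if not low <= rec[var] <= high]


# Hypothetical first and second keying of the same three questionnaires.
first_pass = {
    "H001": {"age": 34, "emp": 1},
    "H002": {"age": 51, "emp": 0},
    "H003": {"age": 91, "emp": 0},
}
second_pass = {
    "H001": {"age": 43, "emp": 1},
    "H002": {"age": 51, "emp": 0},
    "H003": {"age": 19, "emp": 0},
}
mismatches = compare_entries(first_pass, second_pass)
bad_ages = out_of_range(first_pass, "age", 15, 64)
```

Each mismatch would be resolved against the paper questionnaire, which is why keying the data twice catches errors that a range check alone would miss (for example 34 entered as 43).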
2. Reading Literacy Assessment Data
All survey firms were required to score the Reading Literacy Assessment booklets and to enter the data using the Data Entry Program developed by ETS. A double data entry process was required. Consistency checks were also included in the data entry program.
Estimates of Sampling Error
Weighting documentation was prepared for each participating country and provides some information on sampling errors.
The weighting documentation is provided as an external resource.
STEP Skills Measurement Program, Household Survey 2014, The World Bank. Dataset downloaded from [URL] on [date]
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
(c) STEP 2014, The World Bank
DDI Document ID
Development Economics Data Group
The World Bank
Documentation of the DDI
Date of Metadata Production
DDI Document version
Version 02 (March 2016)
Changes in v02 of study documentation compared to v01 published in June 2014
- v01 datasets were replaced with v02
- Study Title, Series Information and Abstract were edited
Version 01 (June 2014)