COVID-19 High Frequency Phone Survey of Households 2020, Round 1
Socio-Economic/Monitoring Survey [hh/sems]
The main objective of this project is to collect household data for the ongoing assessment and monitoring of the socio-economic impacts of COVID-19 on households and family businesses in Vietnam. The estimated fieldwork period and household sample size for each round are as follows:
Round 1 June fieldwork - approximately 6300 households (at least 1300 minority households)
Round 2 August fieldwork - approximately 4000 households (at least 1000 minority households)
Round 3 September fieldwork - approximately 4000 households (at least 1000 minority households)
Round 4 December fieldwork - approximately 4000 households (at least 1000 minority households)
Round 5 - pending discussion
Kind of Data
Sample survey data [ssd]
Unit of Analysis
Version 01: Anonymized dataset for public distribution.
The Vietnam COVID-19 High Frequency Phone Survey covered the following topics:
- Household Roster
- Behaviour and Social Distancing
- Access to health services
- Education and childcare
- Employment of the respondent
- Employment of the respondent's other household members
- Shocks / coping
- Safety nets
- Food Insecurity Experience Scale
Producers and sponsors
Mekong Development Research Institute
Survey collection firm
Partially funded the activity
Department of Foreign Affairs and Trade of Australia
Partially funded the activity
The 2020 Vietnam COVID-19 High Frequency Phone Survey of Households (VHFPS) uses a nationally representative household survey from 2018 as the sampling frame. The 2018 baseline survey includes 46980 households from 3132 communes (about 25% of all communes in Vietnam). In each commune, one enumeration area (EA) was randomly selected, and 15 households were then randomly selected in each EA for interview. Of these 15 households, 3 have information collected on both income and expenditure, as well as many other aspects (large module). The remaining 12 households have information collected on income but not on expenditure (small module). The large module therefore includes 9396 households and is representative at the regional and national levels, while the whole sample is representative at the provincial level.
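The household counts quoted above follow directly from the design. A minimal arithmetic check in Python, using only the figures stated in this section (the variable names are illustrative):

```python
# Figures quoted in the sampling description above.
communes = 3132            # communes in the 2018 baseline survey
households_per_ea = 15     # households selected in the one EA drawn per commune
large_module_per_ea = 3    # households with both income and expenditure data
small_module_per_ea = households_per_ea - large_module_per_ea

total_households = communes * households_per_ea
large_module_households = communes * large_module_per_ea
small_module_households = communes * small_module_per_ea

print(total_households)         # 46980
print(large_module_households)  # 9396
print(small_module_households)  # 37584
```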
We use the large module to select the households for official interview in the VHFPS and the small module households as a reserve for replacement. The large module has 9396 households, of which 7951 have a phone number (cell phone or landline).
After data processing, the final sample size is 6,213 households.
The target for Round 1 was to complete interviews with 6300 households, of which 1888 are located in urban areas and 4475 in rural areas. In addition, at least 1300 ethnic minority households were to be interviewed. A random selection of 6300 households was drawn from the 7951 households for official interview, with the rest held for replacement. However, the refusal rate was about 27 percent, so households from the small module in the same EA were also contacted as replacements; these households were likewise randomly selected.
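The selection and replacement logic described above can be sketched roughly as follows. This is an illustration only: the household IDs, the EA label, the seeds, and the function names are hypothetical, and the documentation does not state the exact procedure or software the survey team used.

```python
import random

def split_main_and_reserve(household_ids, target, seed=2020):
    """Randomly draw `target` households for interview; the rest are held back."""
    rng = random.Random(seed)
    shuffled = list(household_ids)
    rng.shuffle(shuffled)
    return shuffled[:target], shuffled[target:]

def draw_replacement(ea, small_module_by_ea, rng):
    """Randomly pick an unused small-module household from the same EA, if any."""
    candidates = small_module_by_ea.get(ea, [])
    if not candidates:
        return None
    # pop() removes the chosen household so it cannot be drawn twice.
    return candidates.pop(rng.randrange(len(candidates)))

# Hypothetical IDs standing in for the 7951 large-module households with a phone.
phone_households = list(range(1, 7952))
main, reserve = split_main_and_reserve(phone_households, target=6300)
print(len(main), len(reserve))  # 6300 1651

# Hypothetical reserve pool for one EA (12 small-module households per EA).
rng = random.Random(1)
small_module_by_ea = {"EA-001": [f"EA-001-S{i}" for i in range(1, 13)]}
replacement = draw_replacement("EA-001", small_module_by_ea, rng)
```

Drawing the replacement from the same EA keeps the geographic spread of the sample unchanged when a household refuses.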
The main steps for weight adjustment to ensure national and regional representativeness are:
(1) Start with base weights from the 2018 survey.
(2) Household selection and non-response adjustment, using a propensity score and a probability-of-selection correction.
(3) Post-stratification: rescale weights to match national, regional, and urban/rural populations based on the 2019 population census.
(4) Trim weights
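The four steps above could look roughly like the sketch below. It rests on several assumptions not stated in this documentation: the selection probability and response propensity are taken as already estimated per household, each household belongs to a single post-stratification cell, and weights are trimmed at a multiple of the median (the `cap_ratio` parameter is hypothetical; the actual trimming rule is not documented). All field names and figures are illustrative.

```python
from statistics import median

def adjust_weights(records, cell_totals, cap_ratio=5.0):
    """Return one final weight per record, following steps (1) to (4)."""
    # (1)-(2): inflate the base weight by the inverse of the selection
    # probability and of the estimated response propensity.
    weights = [
        r["base_weight"] / (r["p_select"] * r["p_respond"]) for r in records
    ]

    # (3): rescale within each post-stratification cell to its population total.
    cell_sums = {}
    for r, w in zip(records, weights):
        cell_sums[r["cell"]] = cell_sums.get(r["cell"], 0.0) + w
    weights = [
        w * cell_totals[r["cell"]] / cell_sums[r["cell"]]
        for r, w in zip(records, weights)
    ]

    # (4): trim extreme weights at cap_ratio times the median (assumed rule).
    cap = cap_ratio * median(weights)
    return [min(w, cap) for w in weights]

# Toy data: three households in two hypothetical cells.
records = [
    {"base_weight": 100, "p_select": 0.5, "p_respond": 0.8, "cell": "urban"},
    {"base_weight": 100, "p_select": 0.5, "p_respond": 0.5, "cell": "urban"},
    {"base_weight": 200, "p_select": 1.0, "p_respond": 0.5, "cell": "rural"},
]
cell_totals = {"urban": 1300, "rural": 800}  # hypothetical census totals

final = adjust_weights(records, cell_totals)
print([round(w, 2) for w in final])  # [500.0, 800.0, 800.0]
```

Trimming after post-stratification means the weighted totals can fall slightly below the census totals whenever a weight is capped; a production procedure might rescale once more after trimming.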
Dates of Data Collection
Data Collection Mode
Computer Assisted Telephone Interview [cati]
Data Collection Notes
About 14% of the sample was interviewed in July rather than June. For those interviewed in June, questions with a reference period of “last month” referred to May 2020; note that the lockdown was still in effect in early May. For the smaller share of respondents interviewed in July, the reference period for these questions was June 2020, which was entirely post-lockdown. Responses about business activity may therefore differ between these two groups because of the different reference periods.
Mekong Development Research Institute
The questionnaire for Round 1 consisted of the following sections:
Section 2. Behavior
Section 3. Health
Section 4. Education & Child caring
Section 5A. Employment (main respondent)
Section 5B. Employment (other household member)
Section 6. Coping
Section 7. Safety Nets
Section 8. FIES
Data cleaning began during the data collection process. Inputs to the cleaning process included interviewers’ notes following each question item, interviewers’ notes at the end of the tablet form, and supervisors’ notes made during monitoring. The data cleaning process was conducted in the following steps:
• Append households interviewed in ethnic minority languages to the main dataset of households interviewed in Vietnamese.
• Remove unnecessary variables that were automatically calculated by SurveyCTO.
• Remove household duplicates in the dataset where the same form is submitted more than once.
• Remove observations of households which were not supposed to be interviewed following the identified replacement procedure.
• Format variables according to their data type (string, integer, decimal, etc.).
• Read through interviewers’ notes and make adjustments accordingly. During interviews, whenever interviewers found it difficult to choose a correct code, they were advised to choose the most appropriate one and write down the respondent’s answer in detail so that the survey management team could decide which code best suited the answer.
• Correct data based on supervisors’ notes where enumerators entered the wrong code.
• Recode the answer option “Other, please specify”. This option is usually followed by a blank line where enumerators can type text to specify the answer. The data cleaning team checked these answers thoroughly to decide whether each needed recoding into one of the available categories or should be kept as originally recorded. In some cases, an answer was assigned a completely new code if it appeared many times in the survey dataset.
• Examine the accuracy of outlier values, defined as values below the 5th percentile or above the 95th percentile, by listening to interview recordings.
• Run a final check on matching the main dataset with the different sections. Sections where information is asked at the individual level are kept in separate data files, in long form.
• Label variables using the full question text.
• Label variable values where necessary.
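Two of the cleaning steps above, removing duplicate form submissions and flagging outliers for review, can be sketched in a few lines of Python. This is an illustration under assumptions: the field name `household_id` and the toy values are hypothetical, the rule of keeping the first submission is an assumption, and the percentile method used by the cleaning team is not documented (the sketch uses the default method of `statistics.quantiles`).

```python
from statistics import quantiles

def drop_duplicate_forms(rows, key="household_id"):
    """Keep the first submission per household; drop later repeats."""
    seen = set()
    unique = []
    for row in rows:
        if row[key] in seen:
            continue
        seen.add(row[key])
        unique.append(row)
    return unique

def flag_outliers(values):
    """Return values below the 5th or above the 95th percentile."""
    cuts = quantiles(values, n=20)  # 19 cut points at 5%, 10%, ..., 95%
    low, high = cuts[0], cuts[-1]
    return [v for v in values if v < low or v > high]

# Toy submissions: household 2 was submitted twice.
rows = [
    {"household_id": 1, "income": 40},
    {"household_id": 2, "income": 55},
    {"household_id": 2, "income": 55},
    {"household_id": 3, "income": 70},
]
cleaned = drop_duplicate_forms(rows)
print(len(cleaned))  # 3

# Toy values 1..100: the extremes at both ends are flagged for review.
flagged = flag_outliers(list(range(1, 101)))
```

Flagged values are candidates for checking against the interview recordings, not values to be dropped automatically.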
For the creation of the public data set, additional anonymization processes were conducted:
• Create anonymized household and respondent ids. Households can be identified only by region and urban/rural location.
• Remove text responses.
• Aggregate low frequency response categories.
• Drop questions that cannot be properly anonymized.
• Remove questions that may contain sensitive information.
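A rough sketch of two of the anonymization steps above, creating anonymized ids and aggregating low-frequency response categories. The frequency threshold (`min_count`), the "Other" label, the seed, and the sample ids are all hypothetical; the documentation does not state which thresholds or id scheme were actually used.

```python
import random
from collections import Counter

def anonymize_ids(original_ids, seed=0):
    """Map each original id to a random sequential id with no link to the source."""
    rng = random.Random(seed)
    unique = sorted(set(original_ids))
    rng.shuffle(unique)  # break any ordering that could reveal the original id
    mapping = {orig: new for new, orig in enumerate(unique, start=1)}
    return [mapping[o] for o in original_ids]

def aggregate_rare(values, min_count=5, other_label="Other"):
    """Replace categories seen fewer than min_count times with a pooled label."""
    counts = Counter(values)
    return [v if counts[v] >= min_count else other_label for v in values]

# Hypothetical original ids; the same household keeps the same anonymized id.
anon = anonymize_ids(["HH-0457", "HH-0012", "HH-0457", "HH-0999"])

# Hypothetical responses: "Fishing" appears once and is pooled into "Other".
responses = ["Farming"] * 6 + ["Trade"] * 5 + ["Fishing"]
pooled = aggregate_rare(responses)
print(Counter(pooled))
```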
World Bank Microdata Library
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
World Bank. Vietnam COVID-19 High Frequency Phone Survey of Households 2020 - Round 1. Dataset downloaded from www.microdata.worldbank.org on [date].
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
DDI Document ID
Development Economics Data Group
The World Bank
Documentation of the DDI
Date of Metadata Production
DDI Document version
Version 01 (November 2020). This metadata contains Round 1 survey information.