ETH_2020-2023_HFPS_v13_M
High Frequency Phone Survey 2020-2023
HFPS 2020-23
Name | Country code |
---|---|
Ethiopia | ETH |
Socio-Economic/Monitoring Survey [hh/sems]
The World Bank is providing support to countries to help mitigate the spread and impact of the new coronavirus disease (COVID-19) and beyond. One area of support is data collection to inform evidence-based policies that may help mitigate the effects of this disease and other economic and social problems. Towards this end, the World Bank is leveraging the Living Standards Measurement Study - Integrated Survey on Agriculture (LSMS-ISA) program to implement high-frequency phone surveys in five African countries: Nigeria, Ethiopia, Uganda, Tanzania, and Malawi. This effort is part of a broader first wave of World Bank-supported national longitudinal high-frequency surveys that can be used to help assess the economic and social implications of the COVID-19 pandemic and other socio-economic shocks on households and individuals.
Sample survey data [ssd]
Individual and household
Version 13: Edited, anonymized dataset for public distribution
2023-12-05
This version includes datasets from 18 rounds of the survey.
The Ethiopia - High Frequency Phone Survey covered the following topics:
National coverage - rural and urban
The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.
Name |
---|
World Bank |
Name | Role |
---|---|
Central Statistical Agency | Collaborator |
Name | Abbreviation | Role |
---|---|---|
United States Agency for International Development | USAID | Funded the study |
The World Bank Group | WB | Funded the study |
Global Financing Facility | GFF | Funded the study |
The sample of the HFPS-HH is a subsample of the 2018/19 Ethiopia Socioeconomic Survey (ESS). The ESS is built on a nationally and regionally representative sample of households in Ethiopia. ESS 2018/19 interviewed 6,770 households in urban and rural areas. In the ESS interview, households were asked to provide phone numbers, either their own or that of a reference household (i.e., friends or neighbors), so that they could be contacted in follow-up ESS surveys should they move from their sampled location. At least one valid phone number was obtained for 5,374 households (4,626 owning a phone and 995 with a reference phone number). These households formed the sampling frame for the HFPS-HH.
To obtain representative strata at the national, urban, and rural level, the target sample size for the HFPS-HH is 3,300 households: 1,300 in rural areas and 2,000 in urban areas. In rural areas, we attempted to call all phone numbers included in the ESS, as only 1,413 households owned phones and another 771 households provided reference phone numbers. In urban areas, 3,213 households owned a phone and 224 households provided reference phone numbers. To account for non-response and attrition, all 5,374 households were called in round 1 of the HFPS-HH.
The total number of completed interviews in round one is 3,249 households (978 in rural areas, 2,271 in urban areas).
The total number of completed interviews in round two is 3,107 households (940 in rural areas, 2,167 in urban areas).
The total number of completed interviews in round three is 3,058 households (934 in rural areas, 2,124 in urban areas).
The total number of completed interviews in round four is 2,878 households (838 in rural areas, 2,040 in urban areas).
The total number of completed interviews in round five is 2,770 households (775 in rural areas, 1,995 in urban areas).
The total number of completed interviews in round six is 2,704 households (760 in rural areas, 1,944 in urban areas).
The total number of completed interviews in round seven is 2,537 households (716 in rural areas, 1,821 in urban areas).
The total number of completed interviews in round eight is 2,222 households (576 in rural areas, 1,646 in urban areas).
The total number of completed interviews in round nine is 2,077 households (553 in rural areas, 1,524 in urban areas).
The total number of completed interviews in round ten is 2,178 households (537 in rural areas, 1,641 in urban areas).
The total number of completed interviews in round eleven is 1,982 households (442 in rural areas, 1,540 in urban areas).
The total number of completed interviews in round twelve is 888 households (204 in rural areas, 684 in urban areas).
The total number of completed interviews in round thirteen is 2,876 households (955 in rural areas, 1,921 in urban areas).
The total number of completed interviews in round fourteen is 2,509 households (765 in rural areas, 1,744 in urban areas).
The total number of completed interviews in round fifteen is 2,521 households (823 in rural areas, 1,698 in urban areas).
The total number of completed interviews in round sixteen is 2,336 households.
The total number of completed interviews in round seventeen is 2,357 households.
The total number of completed interviews in round eighteen is 2,237 households (701 in rural areas, 1,536 in urban areas).
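As a rough illustration of the attrition implied by the counts above, the following Python sketch computes, for a subset of rounds, the unweighted share of the 5,374-household sampling frame that completed an interview and the share relative to round 1. The names `FRAME_SIZE`, `completed`, and `completion_rates` are illustrative and not part of the survey documentation; the counts are copied from the text.

```python
# Completed interviews per round, copied from the counts listed above
# (only a subset of rounds is shown for brevity).
FRAME_SIZE = 5374  # households with at least one valid phone number

completed = {1: 3249, 2: 3107, 6: 2704, 12: 888, 13: 2876, 18: 2237}


def completion_rates(completed, frame_size):
    """Return {round: (share of frame, share of round-1 completes)}."""
    baseline = completed[1]
    return {
        rnd: (n / frame_size, n / baseline)
        for rnd, n in sorted(completed.items())
    }


rates = completion_rates(completed, FRAME_SIZE)
for rnd, (of_frame, of_round1) in rates.items():
    print(f"Round {rnd:>2}: {of_frame:.1%} of frame, {of_round1:.1%} of round 1")
```

These are simple unweighted ratios intended only to summarize the counts; they are not official response or attrition rates.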
To obtain unbiased estimates from the sample, the information reported by households needs to be adjusted by a sampling weight (or raising factor) w_h. The sampling weights for the HFPS-HH were constructed following six of the eight steps outlined in Himelein, K. (2014).
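To show how a sampling weight w_h enters an estimate, the sketch below computes a weighted mean, sum(w_h * y_h) / sum(w_h), in Python. It does not reproduce the weight construction steps themselves; the function name `weighted_mean` and the example values are hypothetical and are not taken from the datasets.

```python
def weighted_mean(values, weights):
    """Estimate a population mean as sum(w_h * y_h) / sum(w_h)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * y for w, y in zip(weights, values)) / total_weight


# Hypothetical example: y_h = 1 if household h reports the outcome, else 0.
reported = [1, 0, 1, 0]
w_h = [120.0, 300.0, 80.0, 500.0]  # made-up weights, not survey values
estimate = weighted_mean(reported, w_h)
print(f"Weighted share: {estimate:.3f}; unweighted share: {sum(reported) / len(reported):.3f}")
```

In this made-up example the weighted and unweighted shares differ, which is why estimates from the HFPS-HH should use the provided weights.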
The survey questionnaires were administered to all the households in the sample. The questionnaires consisted of the following sections:
Baseline (Round 1)
Round 2
Round 3
Round 4
Round 5
Round 6
Round 7
Round 8
Round 9
Round 10
Round 11
Round 12
Round 13
Round 14
Round 15
Round 16
Round 17
Round 18
• Household Identification
• Household Roster Update
• Access to Health Services for Individual Household Members
• Food and Non-food prices
• Economic Sentiments (Sample B)
• Food Insecurity Experience Scale (Sample A)
Start | End | Cycle |
---|---|---|
2020-04-22 | 2020-05-13 | Round 1 |
2020-05-14 | 2020-06-03 | Round 2 |
2020-06-04 | 2020-06-26 | Round 3 |
2020-07-27 | 2020-08-14 | Round 4 |
2020-08-24 | 2020-09-17 | Round 5 |
2020-09-21 | 2020-10-14 | Round 6 |
2020-10-19 | 2020-11-10 | Round 7 |
2020-12-01 | 2020-12-21 | Round 8 |
2020-12-28 | 2021-01-22 | Round 9 |
2021-02-01 | 2021-02-23 | Round 10 |
2021-04-12 | 2021-05-11 | Round 11 |
2021-06-01 | 2021-06-20 | Round 12 |
2022-10-03 | 2022-11-05 | Round 13 |
2022-12-13 | 2023-01-13 | Round 14 |
2023-02-23 | 2023-03-25 | Round 15 |
2023-03-26 | 2023-05-21 | Round 16 |
2023-07-11 | 2023-08-04 | Round 17 |
2023-10-02 | 2023-10-30 | Round 18 |
Name |
---|
Laterite BV |
The Ethiopia - COVID-19 High Frequency Phone Survey of Households (HFPS) was conducted using Computer Assisted Telephone Interview (CATI) techniques. The household questionnaire was implemented using the CATI software SurveyCTO. Each enumerator was given a tablet, which they used to conduct the interviews, along with data bundles to be used on their own mobile phone devices.
DATA COMMUNICATION SYSTEM: SurveyCTO's built-in data monitoring functions were used. Each enumerator was provided with a data bundle, allowing for internet connectivity and daily synchronization of their tablet. Data was sent to the server daily. Senior Field Supervisors served as the first step in ensuring data quality. They reviewed the survey with enumerators twice daily via one-on-one calls and were always available to address any concerns that arose during an interview. At the same time, a Research Analyst was in charge of checking the uploaded data daily to correct errors and to help prevent them in future surveys. The following data quality checks were completed:
• Daily SurveyCTO monitoring: This included outlier checks, skipped questions, a review of “Other, specify”, other text responses, and enumerator comments. Enumerator comments were used to suggest new response options or to highlight situations where existing options should be used instead. Monitoring also included a review of variable relationship logic checks and checks of the logic of answers. Finally, outliers in phone variables such as survey duration or the percentage of time audio was at a conversational level were monitored. A survey duration of close to 15 minutes and a conversation-level audio percentage of around 40% were considered normal.
• Dashboard review: This included monitoring individual enumerator performance, such as the number of calls logged, duration of calls, percentage of calls responded to and percentage of non-consents. Non-consent reason rates and attempts per household were monitored as well. Duration analysis using R was used to monitor each module's duration and estimate the time required for subsequent rounds. The dashboard was also used to track overall survey completion and preview the results of key questions.
• Daily Data Team reporting: The Field Supervisors and the Data Manager reported daily feedback on call progress, enumerator feedback on the survey, and any suggestions to improve the instrument, such as adding options to multiple choice questions or adjusting translations.
• Audio audits: Audio recordings were captured during the consent portion of the interview for all completed interviews, for the enumerators' side of the conversation only. The recordings were reviewed for any surveys flagged by enumerators as having data quality concerns and for an additional random sample of 2% of respondents. A range of lengths were selected to observe edge cases. Most consent readings took around one minute, with some longer recordings due to questions on the survey or holding for the respondent. All reviewed audio recordings were completed satisfactorily.
• Back-check survey: Field Supervisors made back-check calls to a random sample of 5% of the households that completed a survey in Round 1. Field Supervisors called these households and administered a short survey, including (i) identifying the same respondent; (ii) determining the respondent's position within the household; (iii) confirming that a member of the data collection team had completed the interview; and (iv) a few questions from the original survey.
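The random selection used for audio audits (2% of respondents) and back-checks (5% of Round 1 households) can be sketched in Python as below. This is an illustration only: the documentation does not state how the draw was made, so the simple random sample, the fixed seed, the rounding up of the sample size, and the placeholder household IDs are all assumptions.

```python
import math
import random


def draw_check_sample(household_ids, share, seed=12345):
    """Draw a simple random sample of households for quality-check calls."""
    ids = sorted(household_ids)       # fixed order so the seed reproduces the draw
    k = math.ceil(share * len(ids))   # rounding up is an assumption
    return random.Random(seed).sample(ids, k)


# 3,249 placeholder IDs standing in for the Round 1 completed interviews.
completed_ids = [f"HH{n:04d}" for n in range(1, 3250)]
backcheck = draw_check_sample(completed_ids, share=0.05)
print(len(backcheck), "households selected for back-check calls")
```

Fixing the seed makes the selection reproducible, which helps when the list of selected households needs to be audited later.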
DATA CLEANING
At the end of data collection, the raw dataset was cleaned by the Research team. This included formatting and correcting results based on monitoring issues, enumerator feedback, and survey changes. The details are as follows.
Variable naming and labeling:
• Variable names were changed to reflect the lowercase question name in the paper survey copy, and a word or two related to the question.
• Variables were labeled with longer descriptions of their contents and the full question text was stored in Notes for each variable.
• “Other, specify” variables were named similarly to their related question, with “_other” appended to the name.
• Value labels were assigned where relevant, with options shown in English for all variables, unless preloaded from the roster in Amharic.
Variable formatting:
• Variables were formatted as their object type (string, integer, decimal, time, date, or datetime).
• Multi-select variables were saved both as space-separated single variables and as multiple binary variables showing the yes/no value of each possible response.
• Time and date variables were stored as POSIX timestamp values and formatted to show Gregorian dates.
• Location information was left in separate ID and Name variables, following the format of the incoming roster. IDs were formatted to include only the variable-level digits, and not the higher-level prefixes (2-3 digits only).
• Full Household and Enumeration Area ID variables were given leading 0s to match incoming roster format.
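The formatting conventions above can be illustrated with a short Python sketch covering three of them: expanding a space-separated multi-select answer into binary indicators, converting a POSIX timestamp to a Gregorian date, and restoring leading zeros on an ID. The function names, the option codes, the UTC time zone, and the ID width of 6 are assumptions made for the example; the documentation does not specify them.

```python
from datetime import datetime, timezone


def expand_multiselect(raw, options):
    """Turn a space-separated multi-select answer into 0/1 indicators."""
    chosen = set(raw.split())
    return {opt: int(opt in chosen) for opt in options}


def posix_to_date(timestamp):
    """Convert a POSIX timestamp (seconds since 1970-01-01 UTC) to an ISO date."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).date().isoformat()


def pad_id(value, width):
    """Left-pad an ID with zeros to a fixed width."""
    return str(value).zfill(width)


print(expand_multiselect("1 3", ["1", "2", "3"]))  # hypothetical option codes
print(posix_to_date(1587513600))                   # interpreted in UTC (assumption)
print(pad_id(4021, 6))                             # width 6 is hypothetical
```

When reading the released files, check the actual ID widths and the time zone of the timestamps against the data before applying conversions like these.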
Observation and variable arrangement:
• Only consented surveys were kept in the dataset, and all personal information and internal survey variables were dropped from the clean dataset.
• Roster data were separated from the main dataset and kept in long form; they can be merged with the main dataset on the key variable (key can also be used to merge with the raw data).
• In the main dataset, ii4_resp_id and cs7_hhh_id are the roster IDs of the respondent and household head respectively, and can be merged with individual_id in the roster.
• The variables were arranged in the same order as the paper instrument, with observations arranged according to their submission time.
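A minimal Python sketch of the merge described above follows. It looks up the respondent and the household head in the long-form roster using key together with ii4_resp_id or cs7_hhh_id matched against individual_id. The example records, the `age` column, and the helper `attach_member` are hypothetical, and the sketch assumes that individual_id is unique within each key.

```python
# Hypothetical records; only key, ii4_resp_id, cs7_hhh_id and individual_id
# are variable names taken from the documentation.
main = [
    {"key": "uuid:a1", "ii4_resp_id": 2, "cs7_hhh_id": 1},
    {"key": "uuid:b2", "ii4_resp_id": 1, "cs7_hhh_id": 1},
]
roster = [
    {"key": "uuid:a1", "individual_id": 1, "age": 52},
    {"key": "uuid:a1", "individual_id": 2, "age": 47},
    {"key": "uuid:b2", "individual_id": 1, "age": 33},
]

# Index the long-form roster by (key, individual_id) for direct lookup.
roster_index = {(r["key"], r["individual_id"]): r for r in roster}


def attach_member(record, id_field, prefix, index):
    """Copy roster columns of the member named by id_field into the record."""
    merged = dict(record)
    member = index.get((record["key"], record[id_field]))
    if member is not None:
        for name, value in member.items():
            if name not in ("key", "individual_id"):
                merged[f"{prefix}_{name}"] = value
    return merged


merged = [
    attach_member(attach_member(r, "ii4_resp_id", "resp", roster_index),
                  "cs7_hhh_id", "head", roster_index)
    for r in main
]
for row in merged:
    print(row)
```

Records with no matching roster entry are left unchanged here, which corresponds to a left join on the main dataset.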
Backcheck data review: Results of the backcheck survey were compared against the originally captured survey results using the bcstats command in Stata. This command compares variables across the two datasets and identifies any discrepancies. Each discrepancy identified was then examined individually to determine whether it was within reason.
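The kind of comparison involved can be sketched in Python as follows. This is a simplified illustration and not the bcstats command or its algorithm: it matches back-check records to original records on a household ID, counts the values that differ for each variable, and lists the discrepancies for manual review. The function name, the `hh_id` field, and the example variables are hypothetical.

```python
def compare_backcheck(original, backcheck, id_field, variables):
    """Return per-variable discrepancy rates and the list of discrepancies."""
    original_by_id = {row[id_field]: row for row in original}
    compared = {v: 0 for v in variables}
    differing = {v: 0 for v in variables}
    discrepancies = []
    for bc_row in backcheck:
        orig_row = original_by_id.get(bc_row[id_field])
        if orig_row is None:
            continue  # back-check record with no matching original survey
        for var in variables:
            compared[var] += 1
            if orig_row[var] != bc_row[var]:
                differing[var] += 1
                discrepancies.append(
                    (bc_row[id_field], var, orig_row[var], bc_row[var])
                )
    rates = {v: differing[v] / compared[v] for v in variables if compared[v]}
    return rates, discrepancies


# Hypothetical example data.
original = [
    {"hh_id": "010101", "hh_size": 5, "resp_is_head": 1},
    {"hh_id": "010102", "hh_size": 3, "resp_is_head": 0},
]
backcheck = [
    {"hh_id": "010101", "hh_size": 5, "resp_is_head": 1},
    {"hh_id": "010102", "hh_size": 4, "resp_is_head": 0},
]
rates, discrepancies = compare_backcheck(
    original, backcheck, "hh_id", ["hh_size", "resp_is_head"]
)
print(rates)
print(discrepancies)
```

Listing each discrepancy with both values supports the individual review described above, since some differences (for example, a change in household size between calls) may be legitimate.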
Use of the dataset must be acknowledged using a citation which would include:
World Bank. Ethiopia - High Frequency Phone Survey 2020-2023. Ref: ETH_2020-2023_HFPS_v13_M. Dataset downloaded from www.microdata.worldbank.org on [date].
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
Name | Affiliation | Email |
---|---|---|
LSMS | The World Bank Group | lsms@worldbank.org |
DDI_ETH_2020-2023_HFPS_v13_M
Name | Abbreviation | Affiliation | Role |
---|---|---|---|
Development Economics Data Group | DECDG | The World Bank | Documentation of the DDI |
2023-08-02
Version 13 (February 2024). This is an update to the Ethiopia High-Frequency Phone Survey documentation with round 18 data and documents.
2024-02-27