KEN_2016_HSN-P2_v01_M
Hunger Safety Net Programme Survey 2016
Phase 2
Name | Country code |
---|---|
Kenya | KEN |
Other Household Survey [hh/oth]
The Kenya Hunger Safety Net Programme Phase 2 survey instrument (HSNP2) was developed by Oxford Policy Management Ltd (OPM) to carry out the impact evaluation of the second phase of the Hunger Safety Net Programme, implemented in Mandera, Marsabit, Turkana and Wajir counties in Kenya. The first phase of the HSNP ran from 2009 to 2013. OPM also conducted the evaluation of HSNP Phase 1. Data from the HSNP Phase 1 Randomised Controlled Trial are available on the World Bank Microdata Catalog.
The HSNP was scaled up under Phase 2 in July 2013. Phase 2 was contracted to run until March 2018. The household survey for HSNP2 consists of three tools:
i) household questionnaire;
ii) business questionnaire; and
iii) livestock trader questionnaire.
The purpose of collecting new data for this evaluation was to gather richer information than was already available through the HSNP2 Management Information System (MIS) data, such as on key outcome areas like poverty and consumption, and to enable an estimate.
Sample survey data [ssd]
Version 2.1: Edited, anonymous dataset for public distribution.
2020-03-13
Household survey: Household characteristics, household listing, livestock ownership and trading, assets and land ownership, household's main dwelling characteristics, food and non-food consumption, agricultural activities, informal and formal transfers, household food security, subjective poverty, saving and borrowing, household jobs and business activities.
Business survey: Type of business and business characteristics (i.e. number of employees, number of hours worked by business owner and employees, value of wages, cost of inputs, revenues and location of economic transactions).
Livestock trader survey: Location of economic transactions, expenditure on taxes, transport, fodder, hired labour, volume of trade, livestock prices.
Topic | Vocabulary |
---|---|
Poverty | World Bank |
Nutrition | World Bank |
Agriculture & Rural Development | World Bank |
Household survey: four counties where the HSNP was implemented, Mandera, Marsabit, Turkana and Wajir.
The business survey data are not representative and were collected from the three main commercial hubs of each of the four HSNP counties. Overall, 282 business questionnaires were administered in the four counties.
The livestock trader survey data are not representative and were collected from the three main livestock markets of each of the four HSNP counties. Overall, 48 livestock traders were interviewed.
At the household level, the study population consists of all the households in the four HSNP counties (i.e. Mandera, Marsabit, Turkana and Wajir). Within a household, the survey covered all de jure household members (usual residents).
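At the market level, the survey covered a random sample of businesses in the three main commercial hubs of each county. The aim was to capture information on three main sectors of the local economy:
· Retailing - shops that sell retail goods on which a price mark-up is applied;
· Services;
· Producers - businesses that transform inputs into outputs.
The following categories of businesses were excluded from the listing:
· Temporary stalls or mobile sellers located outside permanent kiosks;
· Banks;
· Education institutions (schools, universities etc.);
· Health facilities.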
Name |
---|
Oxford Policy Management Limited |
Name | Role |
---|---|
UK Department for International Development | Programme and Evaluation Funder |
Government of Kenya | Programme Funder |
Name | Role |
---|---|
Research Guide Africa | Survey partner |
The household survey used a two-stage sampling approach, for which the sample frame was defined by sub-locations and households in the HSNP Management Information System (MIS) data. The MIS data come from a census of nearly all households in the four HSNP counties. The census contains the information gathered on these households during registration for the HSNP programme, their Proxy Means Test (PMT) score and their assignment to the HSNP cash transfers, as well as information about all payments received by all households since the start of Phase 2. The HSNP acknowledges that a small share of the population was missed by the census and was registered at a later date. The sampling procedure was intended to cover the different sample requirements of the impact evaluation approaches, including the Local Economy-Wide Impact Evaluation (LEWIE), the quantitative impact evaluation based on the Regression Discontinuity (RD) approach, and the Propensity Score Matching (PSM) back-up.
Drawing the sample consisted of two stages:
In the first stage, a first stratification was performed, based on sub-counties within each of the four counties. Sub-locations were then selected within each sub-county. The sampling frame, i.e. list of all sub-locations in each county, was obtained using the HSNP MIS data. An explicit stratification of sub-locations was carried out with the aim of identifying sub-locations that can be defined as towns, nearby villages and remote areas. This was guided by the settlement type classification that has already been made by the programme as part of its sub-location mapping exercise.
Sub-locations were selected using the probability proportional to size (PPS) method. Under this method, larger sub-locations, as measured by household population, are selected with higher probability. Sub-locations served as primary sampling units (PSUs) in each county. Before drawing the sample, sub-locations with too few households to make up the minimum sample size required for the analysis were dropped from the frame. A total of 45 sub-locations that had fewer than 14 households with PMT scores above or below the eligibility cut-off were dropped. In addition, six sub-locations per county (24 in total) were sampled with certainty; these were county capitals and main trading gateways, commercial hubs that were important to include in the sample for the LEWIE. After these 24 certainty sub-locations were also removed from the sample frame, 433 sub-locations remained from which to select the remainder of the sub-location sample.
For the remaining sub-locations in our sample frame, the PPS process was implemented. This starts by first generating a list of all sub-locations in the sample frame. This list was sorted into groups for each of the four counties, which amounts to implicit stratification by county. A sampling step was then calculated, based on the cumulative sum of population sizes and the number of sub-locations to be drawn. The sampling step is used to select sub-locations from the list, beginning from a random start.
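The PPS steps described above can be sketched in a few lines of Python. This is only an illustrative sketch of systematic PPS selection, not the code used for the survey: the function name, the frame and the household counts are invented, and the frame is assumed to be already sorted by county.

```python
import random
from itertools import accumulate


def pps_systematic(units, n_draws, rng):
    """Systematic PPS selection (illustrative sketch).

    units   : list of (name, household_count) pairs, assumed already sorted
              by county so that the pass stratifies implicitly by county.
    n_draws : number of selections to make.
    rng     : a random.Random instance, used only for the random start.
    """
    cumulative = list(accumulate(size for _, size in units))
    step = cumulative[-1] / n_draws      # the sampling step
    start = rng.uniform(0, step)         # random start within the first step
    selected = []
    idx = 0
    for k in range(n_draws):
        point = start + k * step
        # Advance to the first unit whose cumulative size exceeds the point.
        while idx < len(units) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(units[idx][0])
    return selected


# Hypothetical frame: names and household counts are invented for illustration.
frame = [("A", 100), ("B", 300), ("C", 50), ("D", 550)]
sample = pps_systematic(frame, n_draws=4, rng=random.Random(2016))
```

A unit whose size exceeds the sampling step can be hit by more than one selection point, which is how a sub-location can end up in a PPS sample more than once.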
In the second stage, a fixed number of households was randomly selected within each sub-location. Twenty-four households were selected for the purpose of the household quantitative impact evaluation. Eight households were added to the sample for the analysis of the HSNP impact on the local economy (i.e. the LEWIE sample). Selecting a fixed number of households in the second stage in theory delivers a self-weighted sample, because it compensates for the oversampling of larger sub-locations in the first PPS stage. In practice, analysis weights are still required to account for non-response, as outlined further below.
The RD approach required a sample of treatment and control households with a PMT score within a small neighbourhood of the HSNP PMT eligibility threshold. To implement this approach, the bandwidth chosen was a distance of 400 above and below the eligibility cut-off. In each sub-location 32 households were sampled, as follows:
· 12 households below the HSNP eligibility cut-off but within the RD bandwidth;
· 12 households above the cut-off but within the RD bandwidth;
· Four households below the RD bandwidth; and
· Four households above the RD bandwidth.
In some instances, there were insufficient observations below the RD bandwidth to make up the intended LEWIE sample. When this occurred, the shortage was made up by selecting additional households with a PMT above the RD bandwidth. For instance, if there were only two observations below the RD bandwidth in a sub-location, six households would be sampled from above the RD bandwidth.
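The allocation rule for one sub-location can be written out as a small function. This is a sketch under stated assumptions rather than the survey team's procedure: the function and key names are hypothetical, and capping each group at the number of households available is an assumption about how shortages were handled.

```python
def allocate_households(available, rd_per_side=12, lewie_per_side=4):
    """Planned number of households to sample in each PMT group.

    available : dict with the number of households in the frame for each
                group (key names are hypothetical).
    """
    below_outside = min(lewie_per_side, available["below_rd_band"])
    # Any shortage below the RD bandwidth is made up from above the bandwidth.
    shortfall = lewie_per_side - below_outside
    above_outside = min(lewie_per_side + shortfall, available["above_rd_band"])
    return {
        "below_cutoff_within_band": min(rd_per_side, available["below_cutoff_within_band"]),
        "above_cutoff_within_band": min(rd_per_side, available["above_cutoff_within_band"]),
        "below_rd_band": below_outside,
        "above_rd_band": above_outside,
    }


# The example from the text: only two households below the RD bandwidth.
plan = allocate_households({
    "below_cutoff_within_band": 40,
    "above_cutoff_within_band": 40,
    "below_rd_band": 2,
    "above_rd_band": 30,
})
```

With two households available below the bandwidth, the plan takes those two and six from above the bandwidth, so the sub-location total stays at 32.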
The sampling process yielded a sample of 187 sub-locations, including the 24 that were sampled with certainty. 11 sub-locations were sampled twice and one sub-location was sampled three times. 44 sub-locations were selected in Mandera, 46 in Wajir, 48 in Marsabit and 49 in Turkana. In each sub-location 32 households were sampled. In a few sub-locations there were insufficient households to select the desired LEWIE sample, resulting in fewer than 32 households sampled. Overall, 6,384 households were sampled, and of these 5,979 were successfully interviewed.
Within a sub-location, if a selected household could not be surveyed, it was replaced with another household in that sub-location. If some replacement households were selected (due to being unable to identify all initially-sampled households), the number of sampled households in that sub-location increased to a maximum of 48. The households in the replacement pool were further randomised and a sequence of use was determined. Thus, the replacement households were issued in random order. The team supervisor carefully controlled the list of replacement households.
The target population of this research was all the households living in the HSNP counties, comprising both beneficiaries and non-beneficiaries of HSNP cash transfers. The household survey was designed to be representative at the county level.
A business questionnaire was conducted in the three main commercial hubs of each county.
The purpose of the survey was to learn more about local economic activities and livelihoods in the HSNP counties, and the data was used for the LEWIE analysis. The aim was to capture information on three main sectors of the local economy:
· Retailing - shops that sell retail goods on which a price mark-up is applied;
· Services;
· Producers - businesses that transform inputs into outputs.
In each sub-location, a sample of at least seven businesses from each category was targeted. Since no sampling frame for local businesses was available, the survey research teams in each county undertook a listing exercise of all businesses in the main commercial centre of the selected sub-locations. The following categories of businesses were excluded from the listing:
· Temporary stalls or mobile sellers located outside permanent kiosks;
· Banks;
· Education institutions (schools, universities etc.);
· Health facilities.
Once the listing was completed, the team leader sampled the required number of businesses using a step sampling approach. Overall, 282 business questionnaires were administered in the four counties.
The business survey is not representative of any commercial hubs.
Since livestock trading is a very important activity in the HSNP counties, a number of livestock traders have been interviewed to understand better how the market works. In each county, three main livestock markets were targeted for interviews.
Each enumerator team was asked to interview four traders in each of the sub-locations, leading to a total sample size of 12 livestock trader interviews per county. Sampling of livestock traders was mostly done purposively. To the extent possible, team leaders sampled livestock traders in order to achieve a balance between those trading large animals, those trading small or medium value animals, those trading only within the HSNP counties and those who also trade outside the HSNP counties.
The livestock trader survey is not representative of any livestock markets.
Overall, 6,384 households were sampled, and of these 5,979 were successfully interviewed. The overall loss of sample is 6.8% (405 households not interviewed). This is below the 10% level that was estimated as a critical cut-off for an acceptable loss. Moreover, the RD sample size, which was determined based on power calculation and is most relevant, is mostly preserved with the loss for the specific RD sample of just 6%.
All 12 selected commercial hubs were visited, with 282 business questionnaires administered in the four counties. The target sample size of 252 businesses was exceeded.
In each county, three main livestock markets were targeted for interviews. All 12 selected livestock markets were visited. Each team was asked to interview four traders in each market, leading to a total sample size of 12 livestock trader interviews per county. The targeted sample size was achieved in all counties.
The total number of households sampled across treatment and control groups in each sub-location was 24 for the household quantitative impact evaluation sample, if no replacements were needed in the sub-location. If some replacement households were selected (due to being unable to identify all initially-sampled households), the number of sampled households in that sub-location increased to a maximum of 48. The number of households approached is the number of households that were found in the process of trying to reach the 24 households originally sampled in each sub-location (including any who did not consent to interview). If replacements were needed in a sub-location, the number of households approached is the number that were found in the process of trying to reach the originally sampled households as well as the added replacements. The final number of households in our sample corresponds to the sum of the total number of households successfully interviewed in each sub-location forming part of our sample.
Weights were calculated by multiplying the ratio of households originally sampled, or households originally sampled plus any additionally sampled replacements, over the total number of households in the sub-location, by the ratio of the number of households that were successfully interviewed, over the total number of households approached, in the sub-location. This sample-weighted response rate is then inverted to create the response weights. This procedure provides a household non-response rate at the sub-location level whilst also adjusting for the replacement protocol adopted as part of the sampling strategy. Weights have been normalised so that their sum is equal to the number of households in the sample, which is done for the purposes of statistical inference.
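The response-weight calculation can be illustrated with a short Python sketch. The figures and names below are invented, and the sketch assumes that "the number of households in the sample" used for normalisation means the number of households successfully interviewed.

```python
def response_weights(sublocations):
    """Normalised household response weight for each sub-location.

    sublocations : dict mapping a sub-location name to a dict with
        sampled     - households originally sampled, plus any replacements
        total       - all households in the sub-location
        interviewed - households successfully interviewed
        approached  - households approached
    """
    raw = {}
    for name, s in sublocations.items():
        # Sample-weighted response rate, then inverted.
        rate = (s["sampled"] / s["total"]) * (s["interviewed"] / s["approached"])
        raw[name] = 1.0 / rate
    # Normalise so the weights sum, over households, to the sample size.
    n_households = sum(s["interviewed"] for s in sublocations.values())
    weight_sum = sum(raw[name] * s["interviewed"] for name, s in sublocations.items())
    scale = n_households / weight_sum
    return {name: w * scale for name, w in raw.items()}


# Hypothetical sub-locations; all counts are invented for illustration.
example = {
    "A": {"sampled": 24, "total": 240, "interviewed": 24, "approached": 24},
    "B": {"sampled": 30, "total": 600, "interviewed": 24, "approached": 30},
}
weights = response_weights(example)
```

Every interviewed household in a sub-location receives that sub-location's weight, so a sub-location with a lower sampling fraction or lower response gets a larger weight.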
In order to take into account sub-location selection, cluster level weights have been computed by multiplying the household weight by the inverted ratio of the number of sampled sub-locations over the total number of sub-locations by county. Cluster level weights are calculated for both the sub-sample including purposefully selected sub-locations and the sub-sample excluding purposefully selected sub-locations.
In addition to household weights, population weights are provided. Population weights are calculated by multiplying the household level weight by the ratio of total households in the HSNP counties, over the households originally sampled.
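The cluster-level and population weights described above are simple rescalings of the household weight, as the following sketch shows. The function names and all numeric inputs are hypothetical; the actual county totals are not given in this description.

```python
def cluster_weight(household_weight, sampled_sublocations, total_sublocations):
    """Household weight times the inverse of the share of sub-locations
    sampled in the household's county."""
    return household_weight * (total_sublocations / sampled_sublocations)


def population_weight(household_weight, total_households, sampled_households):
    """Household weight times the ratio of all households in the HSNP
    counties to the households originally sampled."""
    return household_weight * (total_households / sampled_households)


# Invented inputs for illustration only.
w_cluster = cluster_weight(1.2, sampled_sublocations=40, total_sublocations=100)
w_population = population_weight(0.5, total_households=600_000, sampled_households=6_000)
```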
Survey setting
· The household-level weight in the dataset is 'weight'
· The household-level weight using cluster level weights including purposefully selected sub-locations in the dataset is 'weight2'
· The household-level weight using cluster level weights excluding purposefully selected sub-locations in the dataset is 'weight3'
· The population-level weight in the dataset is 'popweight'
· The population-level weight for each population sub-group as defined by the RD bandwidth (i.e. households below the HSNP eligibility cut-off but within the RD bandwidth; households above the cut-off but within the RD bandwidth; households below the RD bandwidth and households above the RD bandwidth) in the dataset is 'popweight_multiple'
· The primary sampling units are the sampled sub-locations. The variable used in the dataset to identify the sub-locations is 'C_4'.
· Stratification during sampling was used at the primary sampling level, i.e., at the sub-location level. For the estimation set-up, strata for sub-locations were defined by county. The strata variable included in the dataset is 'C_1'.
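As a rough illustration of how these variables might be used, the sketch below computes a weighted mean by county stratum. The variable names 'weight' and 'C_1' come from the list above; the outcome name 'consumption' and all values are invented. The sketch gives point estimates only; standard errors would also need to account for the PSU variable 'C_4', for example through a survey-estimation routine.

```python
from collections import defaultdict


def weighted_mean_by_stratum(rows, outcome, weight="weight", stratum="C_1"):
    """Weighted mean of `outcome` within each stratum (point estimate only)."""
    numerator = defaultdict(float)
    denominator = defaultdict(float)
    for row in rows:
        numerator[row[stratum]] += row[weight] * row[outcome]
        denominator[row[stratum]] += row[weight]
    return {key: numerator[key] / denominator[key] for key in numerator}


# Invented records; 'consumption' is a hypothetical outcome variable.
records = [
    {"C_1": 1, "C_4": 101, "weight": 1.0, "consumption": 10.0},
    {"C_1": 1, "C_4": 102, "weight": 3.0, "consumption": 20.0},
    {"C_1": 2, "C_4": 201, "weight": 2.0, "consumption": 30.0},
]
means = weighted_mean_by_stratum(records, "consumption")
```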
There are no weights for the business and livestock trader surveys, and these samples are not representative at any level.
The household questionnaire was made up of several modules that included questions on: Household characteristics, household listing, livestock ownership and trading, assets and land ownership, household's main dwelling characteristics, food and non-food consumption, agricultural activities, informal and formal transfers, household food security, subjective poverty, saving and borrowing, household jobs and business activities.
The business questionnaire was made up of two modules capturing information about the type of businesses and business characteristics in the three main sectors of the local economy: Retailing - shops that sell retail goods on which a price mark-up is applied; Services; and Producers - businesses that transform inputs into outputs. The questionnaire included questions on number of formal/informal employees, number of hours worked by business owner and employees, value of wages, cost of inputs, revenues and location of economic transactions (i.e. inside or outside the HSNP counties). A separate module was designed for fishmonger businesses, given the seasonality of this activity.
The livestock trader questionnaire was made up of three modules capturing information on the location of economic transactions, and livestock trading (i.e. inside or outside the HSNP counties), traders' expenditure on taxes, transport, fodder, hired labour, volume of trade, as well as livestock prices.
Start | End |
---|---|
2016-02-13 | 2016-06-29 |
Name |
---|
Research Guide Africa |
A rigorous Quality Assurance (QA) process was established for the HSNP survey to provide ongoing support to field teams during their assignment and to protect the quality of the data.
The first element of the QA approach was careful training and piloting of the survey before implementation. This was essential to ensure that the questionnaires were well designed, and that fieldwork teams were thoroughly prepared to undertake the assignment. Training was conducted between 25 January and 6 February 2016, and a pilot was conducted before the main fieldwork from 9 to 12 February. A pre-test of all survey instruments and the tracking protocol was also conducted between 19 and 26 October 2015.
The second crucial element of the QA approach was to develop a fieldwork model that emphasised close and regular communication between fieldwork teams, and between Research Guide Africa (RGA) field staff and OPM. OPM also accompanied RGA fieldwork staff for the initial roll-out of the survey, to help resolve early implementation challenges. This communication allowed teams to raise any issues they were facing and seek support early.
In terms of the integrity of the data itself, there were two safeguards in place. The first was a series of basic consistency and range checks that were built into the survey instrument. These checks meant that interviewers would be notified immediately (during the interview) if data they had entered fell outside an acceptable range or were inconsistent with a previous answer. Second, OPM and RGA teams were able to monitor data on an ongoing basis throughout the fieldwork to identify and respond quickly to any issues as they arose. The ability to track quantitative data quality closely during collection is an opportunity provided by electronic data collection that is not generally possible with paper-based surveys, where there is a lag in receiving data because they must first be entered. A systematic set of cleaning checks was set up and applied to each batch of new data to check for consistency errors and high rates of anomalous responses. A feedback process was in place to communicate immediately with enumerator teams if any concerns became apparent.
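The kind of range and consistency check described here can be sketched as follows. This is a generic illustration, not the checks built into the survey instrument: the field names, ranges and rule are all hypothetical.

```python
def check_record(record, ranges, consistency_rules):
    """Return a list of problems found in one interview record.

    ranges            : dict mapping a field name to (low, high) bounds.
    consistency_rules : list of (description, function) pairs; each function
                        returns True when the record is consistent.
    """
    problems = []
    for field, (low, high) in ranges.items():
        value = record.get(field)
        if value is None or not (low <= value <= high):
            problems.append(f"{field} is missing or outside {low}-{high}")
    for description, rule in consistency_rules:
        if not rule(record):
            problems.append(description)
    return problems


# Hypothetical fields and rules, for illustration only.
ranges = {"household_size": (1, 40), "children_under_5": (0, 15)}
rules = [
    ("children_under_5 exceeds household_size",
     lambda r: r["children_under_5"] <= r["household_size"]),
]
issues = check_record({"household_size": 3, "children_under_5": 5}, ranges, rules)
```

In an electronic instrument a non-empty list of problems would be shown to the interviewer during the interview; the same function could be run over each new batch of data as a cleaning check.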
Research Guide Africa (RGA) conducted the HSNP2 survey.
Fieldwork was undertaken by four field teams of between five and seven people each, including the team leader. The size of each field team was determined by the number of interviews to be conducted in each county and the language requirements. Four county team leaders from the headquarters of the survey partner, Research Guide Africa (RGA), were responsible for supervising ongoing fieldwork, while a fieldwork manager was in charge of managing the overall activities.
The training for the survey took place between 25 January and 6 February 2016. The main objective of the training was to ensure that team members would be able to master the instruments, understand and correctly implement the fieldwork protocols, and comfortably use CAPI. The training had two components: a classroom-based component and a field-based component that included a full-scale pilot. The pilot was conducted before the main fieldwork, from 9 to 12 February. The two-week training session was intense and combined several different methodologies, including PowerPoint presentations, daily assessments, audio-visuals, break-out sessions, plenaries, role plays, mock interviews, and questions and answers.
A full pre-test of all instruments and protocols took place between 19 and 26 October 2015.
Interviews were conducted primarily in English and Swahili, as well as other local languages.
Data collection aimed to maintain the highest possible ethical standards, by seeking the informed consent of all participants in data collection; preserving the anonymity of research participants; ensuring the safety of research participants and protecting the safety of the local researchers who conducted data collection.
Because the data were collected electronically, they were continually checked, edited and processed throughout the survey cycle.
A first stage of data checking was done by the survey team which involved (i) checking of all IDs; (ii) checking for missing observations; (iii) checking for missing item responses where none should be missing; and (iv) first round of checks for inadmissible/out of range and inconsistent values. Additional data processing activities were performed at the end of data collection in order to transform the collected cleaned data into a format that is ready for analysis. The aim of these activities was to produce reliable, consistent and fully-documented datasets that can be analysed throughout the survey and archived at the end in such a way that they can be used by other data users well into the future. Data processing activities involved:
The datasets were then sent to the analysis team where they were subjected to a second set of checking and cleaning activities. This included checking for out of range responses and inadmissible values not captured by the filters built into the CAPI software or the initial data checking process by the survey team.
A comprehensive data checking and analysis system was created, including a logical folder structure and template syntax files (in Stata). This ensured that data checking and cleaning activities were recorded, and that all analysts used the same file and variable naming conventions, variable definitions and disaggregation variables, and applied weights to estimates appropriately.
Name | Affiliation | Email |
---|---|---|
Fred Merttens | Oxford Policy Management Ltd. | fred.merttens@opml.co.uk |
The datasets have been anonymised and are available as a Public Use Dataset. They are accessible to all for statistical and research purposes only, under the following terms and conditions:
The original collector of the data and the funding agencies bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
Oxford Policy Management Limited. Kenya Hunger Safety Net Programme Phase 2 (HSNP2) Survey, 2016, Version 2.1 of the public use dataset (March 2020). Ref: Ken_2016_HSN-P2_v01_M. Downloaded from [url] on [date].
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
(c) 2020, Oxford Policy Management Limited
Name | Affiliation | Email |
---|---|---|
Fred Merttens | Oxford Policy Management Ltd | fred.merttens@opml.co.uk |
Virginia Barberis | Oxford Policy Management Ltd | virginia.barberis@opml.co.uk |
DDI_KEN_2016_HSN-P2_v01_M
Name | Affiliation | Role |
---|---|---|
Barberis, Virginia | Oxford Policy Management Ltd. | Data analyst |
2020-03-16
Version 1 (March 2020).