The General Household Survey-Panel (GHS-Panel) is implemented in collaboration with the World Bank Living Standards Measurement Study (LSMS) team as part of the Integrated Surveys on Agriculture (ISA) program. The objectives of the GHS-Panel include the development of an innovative model for collecting agricultural data, interinstitutional collaboration, and comprehensive analysis of welfare indicators and socio-economic characteristics. The GHS-Panel is a nationally representative survey of approximately 5,000 households, which are also representative of the six geopolitical zones. The 2018/19 is the fourth round of the survey with prior rounds conducted in 2010/11, 2012/13, and 2015/16. GHS-Panel households were visited twice: first after the planting season (post-planting) between July and September 2018 and second after the harvest season (post-harvest) between January and February 2019.
The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.
Producers and sponsors
National Bureau of Statistics (NBS)
Federal Government of Nigeria
Collaborated in the implementation of the survey
Bill and Melinda Gates Foundation
Funded the study
Federal Government of Nigeria
Funded the study
The original GHS-Panel sample of 5,000 households across 500 enumeration areas (EAs) and was designed to be representative at the national level as well as at the zonal level. The complete sampling information for the GHS-Panel is described in the Basic Information Document for GHS-Panel 2010/2011. However, after a nearly a decade of visiting the same households, a partial refresh of the GHS-Panel sample was implemented in Wave 4.
For the partial refresh of the sample, a new set of 360 EAs were randomly selected which consisted of 60 EAs per zone. The refresh EAs were selected from the same sampling frame as the original GHS-Panel sample in 2010 (the “master frame”). A listing of all households was conducted in the 360 EAs and 10 households were randomly selected in each EA, resulting in a total refresh sample of approximated 3,600 households.
In addition to these 3,600 refresh households, a subsample of the original 5,000 GHS-Panel households from 2010 were selected to be included in the new sample. This “long panel” sample was designed to be nationally representative to enable continued longitudinal analysis for the sample going back to 2010. The long panel sample consisted of 159 EAs systematically selected across the 6 geopolitical Zones. The systematic selection ensured that the distribution of EAs across the 6 Zones (and urban and rural areas within) is proportional to the original GHS-Panel sample. Interviewers attempted to interview all households that originally resided in the 159 EAs and were successfully interviewed in the previous visit in 2016. This includes households that had moved away from their original location in 2010. In all, interviewers attempted to interview 1,507 households from the original panel sample.
The combined sample of refresh and long panel EAs consisted of 519 EAs. The total number of households that were successfully interviewed in both visits was 4,976.
Deviations from sample design
While the combined sample generally maintains both national and Zonal representativeness of the original GHS-Panel sample, the security situation in the North East of Nigeria prevented full coverage of the Zone. Due to security concerns, rural areas of Borno state were fully excluded from the refresh sample and some inaccessible urban areas were also excluded. Security concerns also prevented interviewers from visiting some communities in other parts of the country where conflict events were occurring. Refresh EAs that could not be accessed were replaced with another randomly selected EA in the Zone so as not to compromise the sample size. As a result, the combined sample is representative of areas of Nigeria that were accessible during 2018/19. The sample will not reflect conditions in areas that were undergoing conflict during that period. This compromise was necessary to ensure the safety of interviewers.
Two sets of weights were constructed for two different types of analysis. The first set of weights are those for the combined wave 4 sample. They can be used for cross-sectional analysis for the full GHS-Panel wave 4 sample (refresh plus long panel sample). The second set of weights are designed for longitudinal/panel analysis using the long panel sample only. These longitudinal weights can be used for analysis that seeks to track dynamics within long panel households across the 4 waves of the GHS-Panel. When calculating both weights, only households successfully interviewed in both visits of Wave 4 were considered.
The cross-sectional weights were constructed in three stages:
1. Base weights were calculated according to the inverse probability of selection for each household in the sample. In its simplest form, this weight reflects the two-stage design and thus is the product of the probability that the EA was selected from the frame and the probability that the household was selected within the EA.
2. The base weights were then adjusted for non-response within the EA (the ratio of households successfully interviewed and households selected).
3. The weights were calibrated to reflect the distribution of the underlying population. The weights were calibrated to (1) reflect the total number of households in each Zone in 2010 (i.e. during the first wave of the GHS-Panel) according to population predictions from the 2006 Census and (2) reflect the total number of persons as estimated in the (weighted) Wave 3 sample of the GHS-Panel. The calibration in (1) was performed to maintain consistency with the calibration methodology adopted in previous rounds of the GHS-panel.
4. Lastly, outlier weights were trimmed with a lower bound of 400 and an upper of 50,000.
As of December 4, 2019, the longitudinal weights were still being prepared. Further documentation on their construction will be added here once they are released.
The cross-section weights can be found in the cover page data files for both the post-planting (secta_plantingw4.dta) and post-harvest (secta_harvestw4.dta). The variable name in both data files is wt_wave 4.
Dates of collection
Mode of data collection
Computer Assisted Personal Interview [capi]
The GHS-Panel Wave 4 consists of three questionnaires for each of the two visits. The Household Questionnaire was administered to all households in the sample. The Agriculture Questionnaire was administered to all households engaged in agricultural activities such as crop farming, livestock rearing and other agricultural and related activities. The Community Questionnaire was administered to the community to collect information on the socio-economic indicators of the enumeration areas where the sample households reside.
GHS-Panel Household Questionnaire: The Household Questionnaire provides information on demographics; education; health (including anthropometric measurement for children); labor; food and non-food expenditure; household nonfarm income-generating activities; food security and shocks; safety nets; housing conditions; assets; information and communication technology; and other sources of household income. Household location is geo-referenced in order to be able to later link the GHS-Panel data to other available geographic data sets.
GHS-Panel Agriculture Questionnaire: The Agriculture Questionnaire solicits information on land ownership and use; farm labor; inputs use; GPS land area measurement and coordinates of household plots; agricultural capital; irrigation; crop harvest and utilization; animal holdings and costs; and household fishing activities. Some information is collected at the crop level to allow for detailed analysis for individual crops.
GHS-Panel Community Questionnaire: The Community Questionnaire solicits information on access to infrastructure; community organizations; resource management; changes in the community; key events; community needs, actions and achievements; and local retail price information.
The Household Questionnaire is slightly different for the two visits. Some information was collected only in the post-planting visit, some only in the post-harvest visit, and some in both visits.
The Agriculture Questionnaire collects different information during each visit, but for the same plots and crops.
National Bureau of Statistics
Federal Government of Nigeria
CAPI: For the first time in GHS-Panel, the Wave four exercise was conducted using Computer Assisted Person Interview (CAPI) techniques. All the questionnaires, household, agriculture and community questionnaires were implemented in both the post-planting and post-harvest visits of Wave 4 using the CAPI software, Survey Solutions. The Survey Solutions software was developed and maintained by the Survey Unit within the Development Economics Data Group (DECDG) at the World Bank. Each enumerator was given tablets which they used to conduct the interviews. Overall, implementation of survey using Survey Solutions CAPI was highly successful, as it allowed for timely availability of the data from completed interviews.
DATA COMMUNICATION SYSTEM: The data communication system used in Wave 4 was highly automated. Each field team was given a mobile modem allow for internet connectivity and daily synchronization of their tablet. This ensured that head office in Abuja has access to the data in real-time. Once the interview is completed and uploaded to the server, the data is first reviewed by the Data Editors. The data is also downloaded from the server, and Stata dofile was run on the downloaded data to check for additional errors that were not captured by the Survey Solutions application. An excel error file is generated following the running of the Stata dofile on the raw dataset. Information contained in the excel error files are communicated back to respective field interviewers for action by the interviewers. This action is done on a daily basis throughout the duration of the survey, both in the post-planting and post-harvest.
DATA CLEANING: The data cleaning process was done in three main stages. The first stage was to ensure proper quality control during the fieldwork. This was achieved in part by incorporating validation and consistency checks into the Survey Solutions application used for the data collection and designed to highlight many of the errors that occurred during the fieldwork.
The second stage cleaning involved the use of Data Editors and Data Assistants (Headquarters in Survey Solutions). As indicated above, once the interview is completed and uploaded to the server, the Data Editors review completed interview for inconsistencies and extreme values. Depending on the outcome, they can either approve or reject the case. If rejected, the case goes back to the respective interviewer’s tablet upon synchronization. Special care was taken to see that the households included in the data matched with the selected sample and where there were differences, these were properly assessed and documented. The agriculture data were also checked to ensure that the plots identified in the main sections merged with the plot information identified in the other sections. Additional errors observed were compiled into error reports that were regularly sent to the teams. These errors were then corrected based on re-visits to the household on the instruction of the supervisor. The data that had gone through this first stage of cleaning was then approved by the Data Editor. After the Data Editor’s approval of the interview on Survey Solutions server, the Headquarters also reviews and depending on the outcome, can either reject or approve.
The third stage of cleaning involved a comprehensive review of the final raw data following the first and second stage cleaning. Every variable was examined individually for (1) consistency with other sections and variables, (2) out of range responses, and (3) outliers. However, special care was taken to avoid making strong assumptions when resolving potential errors. Some minor errors remain in the data where the diagnosis and/or solution were unclear to the data cleaning team.
Before being granted access to the dataset, all users have to formally agree:
1. To make no copies of any files or portions of files to which s/he is granted access except those authorized by the data depositor.
2. Not to use any technique in an attempt to learn the identity of any person, establishment, or sampling unit not identified on public use data files.
3. To hold in strictest confidence the identification of any establishment or individual that may be inadvertently revealed in any documents or discussion, or analysis. Such inadvertent identification revealed in her/his analysis will be immediately brought to the attention of the data depositor.
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
Nigeria National Bureau of Statistics. General Household Survey, Panel (GHS-Panel) 2018-2019. Dataset downloaded from www.microdata.worldbank.org on [date].
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.