National Panel Survey 2019-2020 - Extended Panel with Sex Disaggregated Data
Living Standards Measurement Study [hh/lsms]
The National Panel Survey (NPS) was originally launched in Tanzania in 2008, with support from the Living Standards Measurement Study – Integrated Surveys on Agriculture [LSMS-ISA] program at the World Bank and other donors. Four rounds of the NPS have been implemented by the Tanzania National Bureau of Statistics (NBS). The first round of the survey was conducted in 2008/09, the second round in 2010/11, the third round in 2012/13 and the fourth round in 2014/15.
The NPS 2019/20 with sex-disaggregated data (NPS-SDD 2019/20) is an off-shoot survey that follows the entire NPS 2014/15 “Extended Panel” sample. The NPS-SDD 2019/20 is the first Extended Panel survey with sex-disaggregated data. It collects information on a wide range of topics including agricultural production, non-farm income generating activities, individual rights to plots, consumption expenditures, and a wealth of other socioeconomic characteristics.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
This revision includes the addition of nps_sdd.child.anthro.dta and an updated Basic Information Document
Designed for analysis of key indicators at the national level.
The universe includes all households and individuals in Tanzania with the exception of those residing in military barracks or other institutions.
Producers and sponsors
Living Standards Measurement Study Team
The sample design for the NPS-SDD 2019/20 targeted the sub-sample of households from the initial NPS cohort originating in 2008/09 and subsequently surveyed in all four consecutive rounds, considered the “Extended Panel”. This consisted of 989 households from the NPS 2014/15 sample to be tracked and interviewed in the NPS-SDD 2019/20.
The sample design also included complete households that could not be interviewed in NPS 2014/15, excluding those that had refused to be interviewed in that round. This constituted an additional 8 households. Individuals meeting the eligibility requirement who were interviewed as part of the NPS 2012/13, but were not located and interviewed during the NPS 2014/15, were also included in this round if located. Additionally, individuals from NPS 2014/15 who moved into another [...] This constituted an additional 158 individuals assigned to their last known associated household.
The eligibility requirement for inclusion in the NPS is defined as any household member aged 15 years and above, excluding live-in servants. Households with at least one eligible member were completely interviewed, including any non-eligible members present in the household. Any household or eligible members that had either moved or split away from a primary household were tracked and interviewed in their new location.
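The eligibility rule described above can be expressed as a small check. This is a minimal sketch only: the `Member` class, its field names, and the function names are hypothetical and do not correspond to variables in the NPS data files.

```python
from dataclasses import dataclass


@dataclass
class Member:
    """Hypothetical household member record (not an NPS data structure)."""
    age: int
    is_live_in_servant: bool = False


def is_eligible(member: Member) -> bool:
    # Eligible: aged 15 years and above, excluding live-in servants.
    return member.age >= 15 and not member.is_live_in_servant


def household_is_interviewed(members: list[Member]) -> bool:
    # A household is fully interviewed if at least one member is eligible;
    # non-eligible members present in the household are then interviewed too.
    return any(is_eligible(m) for m in members)


# Illustrative household: one adult, one child, one live-in servant.
household = [
    Member(age=34),
    Member(age=9),
    Member(age=22, is_live_in_servant=True),
]
eligible = [m for m in household if is_eligible(m)]
print(len(eligible), household_is_interviewed(household))
```

Tracking of members who moved or split away is a fieldwork procedure and is not modelled here.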
Additionally, the final sample for NPS-SDD 2019/20 included any resulting split-off households identified during data collection (i.e. a previous NPS member who had moved or started another household). Ultimately, the final sample size for NPS-SDD 2019/20 was 5,587 individuals in 1,184 households.
As with most panel surveys, a certain portion of panel respondents cannot be re-interviewed over time. This attrition of panel respondents can lead to attrition bias where respondents drop out of the survey non-randomly and where the attrition is correlated with variables of interest. The Tanzania NPS has fortunately maintained low attrition over the rounds, thus minimizing the potential for attrition bias within the datasets.
By the end of data collection, 974 of the 989 households had been located and 908 households were successfully re-interviewed, for a total household attrition rate of 9.2 percent. At the individual level, 2,621 of the 3,188 eligible household members (aged 15 years and above and not a live-in servant) were successfully re-interviewed during the NPS-SDD 2019/20, equating to an individual attrition rate of roughly 17.7 percent between the NPS 2014/15 and the NPS-SDD 2019/20 (for extended panel households).
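The attrition figures can be checked as the share of targeted units that were not re-interviewed. The sketch below assumes that simple definition, since the exact formula and denominator are not stated here; the function name is hypothetical. With the counts quoted above, this ratio gives about 8.2 percent for households and about 17.8 percent for individuals, so the published 9.2 percent household rate presumably rests on a different base or definition.

```python
def attrition_rate(targeted: int, reinterviewed: int) -> float:
    """Percentage of targeted units that were not re-interviewed."""
    if targeted <= 0:
        raise ValueError("targeted must be positive")
    return 100.0 * (targeted - reinterviewed) / targeted


# Counts taken from the text above.
household_rate = attrition_rate(targeted=989, reinterviewed=908)
individual_rate = attrition_rate(targeted=3188, reinterviewed=2621)

print(f"household attrition:  {household_rate:.1f}%")
print(f"individual attrition: {individual_rate:.1f}%")
```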
In order to produce nationally representative statistics with the NPS data, it is necessary to apply weighting or expansion factors. The panel survey weights adjust for differences in the probability of selection into the NPS 2008/09 sample for observations in various strata, 2008/09 households splitting into multiple households in NPS 2010/11 and NPS 2012/13, splitting even further in NPS 2014/15, and attrition between rounds of the survey.
The first round of the NPS sample was a multi-stage clustered sample design. First stage sampling involved the selection of survey clusters with the probability of selection proportional to cluster size within a stratum. The sampling of these clusters was stratified along two dimensions: (i) eight administrative zones (seven on Mainland Tanzania plus Zanzibar as an eighth zone), and (ii) rural versus urban clusters within each administrative zone. The combination of these two dimensions yields 16 strata. In rural areas, a cluster is defined as an entire village. In urban areas, a cluster is defined as a census enumeration area. As a general rule, the probability of selection was higher for clusters within strata where existing data sources showed that the variance of key variables of interest for the NPS (e.g., household consumption and maize production) was likely to be very high, implying the need for more observations to produce reliable estimates.
The methodology used to calculate the panel weights for the extended panel households in NPS 2019/20 was developed as part of the LSMS-ISA work program. Details on the methodology can be found in the paper: Himelein, Kristen. 2013. “Weight Calculations for Panel Surveys with Subsampling and Split-off Tracking.” Statistics and Public Policy, vol. 1, pp. 40-45.
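The stratification and first-stage selection described above can be illustrated with a short sketch. It assumes the textbook form of probability-proportional-to-size (PPS) selection, where a cluster's inclusion probability is the number of clusters drawn times its share of the stratum's total size. The zone labels, cluster names, and sizes are placeholders, and the actual NPS weights also reflect later sampling stages, household split-offs, and attrition adjustments that are not modelled here.

```python
from itertools import product

# Eight administrative zones crossed with rural/urban gives 16 strata.
# Zone labels are placeholders, not the official zone names.
zones = [f"zone_{i}" for i in range(1, 9)]
areas = ["rural", "urban"]
strata = list(product(zones, areas))


def pps_selection_probabilities(cluster_sizes: dict[str, float],
                                n_selected: int) -> dict[str, float]:
    """First-stage inclusion probability for each cluster in one stratum."""
    total = sum(cluster_sizes.values())
    probs = {c: n_selected * size / total for c, size in cluster_sizes.items()}
    if any(p > 1 for p in probs.values()):
        raise ValueError("a cluster is large enough to be selected with certainty")
    return probs


# Made-up cluster sizes for a single stratum.
sizes = {"cluster_a": 500, "cluster_b": 300, "cluster_c": 200}
probs = pps_selection_probabilities(sizes, n_selected=1)

# First-stage base weight is the inverse of the inclusion probability.
base_weights = {c: 1.0 / p for c, p in probs.items()}
print(len(strata), probs, base_weights)
```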
Dates of Data Collection
Data Collection Mode
Computer Assisted Personal Interview [capi]
Tanzania National Bureau of Statistics
Office of the Chief Government Statistician Zanzibar
The NPS-SDD 2019/20 consists of four survey instruments: a Household Questionnaire, Agriculture Questionnaire, Livestock Questionnaire, and a Community Questionnaire.
The Household Questionnaire comprises thematic sections. This questionnaire allows for the construction of a full consumption-based welfare measure, permitting distributional and incidence analysis. Data within the household instrument are structured around a household panel survey and add a further living standards measure in the form of sex-disaggregated data. This additional level of information adds value to the analysis of intra-household dynamics and reveals a more refined picture of welfare in Tanzania. To protect the confidentiality of respondents, sensitive information has been masked in or removed from the public household data files.
The NPS Extended Panel also includes a robust instrument on household agriculture activities. It offers an essential data source for understanding the dynamic role of agriculture in household welfare. Agriculture information is collected at both the plot and crop level on inputs, production and sales, consistent with key phases in the agricultural value chain.
The NPS Extended Panel likewise recognizes the importance of livestock activities to many households. As with the integrated instrument on agriculture, the NPS contains a robust instrument to capture details on these activities. The Livestock Questionnaire is administered to all households participating in these activities and asks about the inputs, outputs, labour, and sales related to these activities. Table 3 provides a more comprehensive list of the sections found within the Livestock Questionnaire.
The Community Questionnaire collects information on physical and economic infrastructure and events in surveyed communities. Responses to the community questionnaire are provided through a group discussion among key informants within the community.
Each of the NPS questionnaires was developed in collaboration with line ministries and donor partners, including the Technical Committee, over a period of several months. The NBS solicited feedback from various stakeholders regarding survey content and design, paying due consideration to comparability with previous panel rounds.
Additional data cleaning was conducted as the final stage of the data processing. Further adjustment of the data post-entry was conducted under the principle of absolute certainty where adjustments must be evidence-based and correction values true beyond a reasonable doubt. As such, the resulting final data files may still contain some inconsistencies and outliers. Handling of these values is thus left entirely to the data user. Throughout the data processing system, versions of the data are archived at all key steps and all checking and cleaning syntax documented and archived.
Estimates of Sampling Error
The sample of households selected in the NPS-SDD 2019/2020 is only one of many samples that could have been selected from the same population. Each alternative sample would yield results slightly different from those of the selected sample. Sampling errors are a measure of the variability between all possible samples; although the degree of variability cannot be directly observed, it can be estimated from the survey results and statistically evaluated. A sampling error can be measured in terms of the standard error for a particular statistic.
Sampling errors for the NPS-SDD 2019/2020 were calculated in Stata using the estat effects command. In addition to the standard error, Stata computed the design effect (DEFF) for each estimate, which is defined as the ratio between the standard error using the given sample design and the standard error that would result if a simple random sample had been used. A DEFF value of 1.0 indicates that the sample design is as efficient as a simple random sample, while a value greater than 1.0 indicates an increase in sampling error due to the use of a more complex and less statistically efficient (but perhaps more logistically efficient) design. Stata also computed the relative error and confidence limits for the estimates.
Sampling errors for the NPS-SDD 2019/2020 are calculated for selected variables considered to be of primary interest at the household and individual levels. For each variable of interest, the value of the statistic (R), its standard error (SE), the number of cases, the design effect (DEFF), the relative standard error (SE/R), and the 95 percent confidence limits (R±2SE) are provided in Tables 1-10 in the BID. The DEFF is considered undefined when the standard error in a simple random sample is zero (when the estimate is close to 0 or 1).
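The quantities listed above follow from an estimate and its two standard errors by simple arithmetic. The sketch below is illustrative only: the function name and input values are made up, and it follows the wording above in taking the design effect as a ratio of standard errors. Note that Stata's estat effects reports this standard-error ratio as DEFT and the corresponding ratio of variances (its square) as DEFF, so it is worth checking which of the two a given table contains.

```python
def sampling_error_summary(estimate: float, se_design: float, se_srs: float) -> dict:
    """Summary measures for one estimate R with design-based standard error SE."""
    # Relative standard error SE/R (undefined when the estimate is zero).
    relative_se = se_design / estimate if estimate != 0 else None
    # Ratio of standard errors (undefined when the SRS standard error is zero).
    se_ratio = se_design / se_srs if se_srs > 0 else None
    return {
        "R": estimate,
        "SE": se_design,
        "SE/R": relative_se,
        "se_ratio": se_ratio,
        "variance_ratio": se_ratio ** 2 if se_ratio is not None else None,
        # Approximate 95 percent confidence limits, R ± 2SE.
        "lower": estimate - 2 * se_design,
        "upper": estimate + 2 * se_design,
    }


# Made-up inputs: a proportion of 0.40 with SE 0.03 under the complex design
# and SE 0.02 under a hypothetical simple random sample of the same size.
summary = sampling_error_summary(estimate=0.40, se_design=0.03, se_srs=0.02)
print(summary)
```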
Jonathan G Kastelic
LSMS Data Manager
The Primary Data Investigator undertakes that no attempt will be made to identify any individual person, family, business, enterprise or organization. If such a unique disclosure is made inadvertently, no use will be made of the identity of any person or establishment discovered and full details will be reported to the NBS. The identification will not be revealed to any other person not included in the Data Access Agreement.
The dataset has been anonymized and is available as a Public Use Dataset. It is accessible to all for statistical and research purposes only, under the following terms and conditions:
1. The data and other materials will not be redistributed or sold to other individuals, institutions, or organizations without the written agreement of the National Bureau of Statistics, Tanzania.
2. The data will be used for statistical and scientific research purposes only. They will be used solely for reporting of aggregated information, and not for investigation of specific individuals or organizations.
3. No attempt will be made to re-identify respondents, and no use will be made of the identity of any person or establishment discovered inadvertently. Any such discovery would immediately be reported to the National Bureau of Statistics.
4. No attempt will be made to produce links among datasets provided by the NBS, or among data from the National Bureau of Statistics and other datasets that could identify individuals or organizations.
5. Any books, articles, conference papers, theses, dissertations, reports, or other publications that employ data obtained from the National Bureau of Statistics will cite the source of data in accordance with the Citation Requirement provided with each dataset.
6. An electronic copy of all reports and publications based on the requested data will be sent to the National Bureau of Statistics.
The original collector of the data, the National Bureau of Statistics, and the relevant funding agencies bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
World Bank. Tanzania National Panel Survey 2019-2020 - Extended Panel with Sex Disaggregated Data (NPS 2019-2020). Ref: TZA_2019_NPS-SDD_v05_M. Downloaded from [uri] on [date]
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
World Bank 2021
DDI Document ID
Development Data Group
Documentation of the study
Date of Metadata Production
DDI Document version
Version 02 (July 2021): Identical to version 01, with additional panel key data.
Version 03 (September 2021): Consumption aggregates data added.
Version 04 (May 2022): AG_SEC_3B_time and AG_SEC_3A_time data added.
Version 05 (July 2022): nps_sdd.child.anthro.dta data and updated version of the BID added.