SHWALITA, short for ‘Survey of Household Welfare and Labour in Tanzania’, is a 4,000-household survey that randomly assigns different survey modules to its respondents. It consists of three separate experiments, carefully bundled into one survey:
(i) A consumption experiment, in which we developed eight alternative consumption questionnaires that were randomly distributed across 4,000 households. These eight designs vary by method (3 diaries and 5 recall modules), length of the reference period in the recall modules, and number of items in the recall modules.
(ii) Labour module experiments, in which we assess the effect of different ways of collecting labour statistics. The experiment uses two different modules, a long module and a short module, and administers each either to the person him/herself or to someone else in the household answering on their behalf (a proxy respondent). Both proxy respondents and self-reporting respondents are sampled randomly from the roster of household members.
(iii) Subjective welfare experiments, in which we use an innovative approach to enhance the comparability of subjective welfare questions. The technique, developed in political science by Gary King, asks the respondent to give scaled answers to qualitative questions (on a scale of 1 to 5, how do you feel about…). To ‘anchor’ the response, the respondent is given a ‘vignette’, a short but powerful story about a fictitious person, and is then asked to place this person on the same scale. Placing the vignette on the same scale makes answers more comparable across households, communities and countries.
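The anchoring idea in (iii) can be illustrated with a minimal sketch. It assumes a single vignette and the simplest nonparametric recoding, in which a respondent's self-rating is expressed relative to their own rating of the vignette. The function name and the three-level output are illustrative assumptions, not the recoding actually used in SHWALITA.

```python
def relative_rating(self_rating: int, vignette_rating: int) -> int:
    """Recode a 1-5 self-rating relative to the same respondent's vignette rating.

    Returns 1 if the respondent places themselves below the vignette person,
    2 if at the same level, and 3 if above. (Hypothetical helper for illustration.)
    """
    for value in (self_rating, vignette_rating):
        if not 1 <= value <= 5:
            raise ValueError("ratings must be on the 1-5 scale")
    if self_rating < vignette_rating:
        return 1
    if self_rating == vignette_rating:
        return 2
    return 3


# Two respondents both answer 3 about themselves, but they use the scale
# differently: one rates the vignette person 2, the other rates it 4.
respondent_a = relative_rating(self_rating=3, vignette_rating=2)
respondent_b = relative_rating(self_rating=3, vignette_rating=4)
```

Although the raw answers are identical, the recoded values differ, which is the sense in which the vignette makes answers comparable across respondents.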
Kind of data
Sample survey data [ssd]
Edited, anonymous dataset for public distribution.
The Survey of Household Welfare and Labour in Tanzania (SHWALITA) surveyed households in villages and urban areas from seven districts across Tanzania: one district in the regions of Dodoma, Pwani, Dar es Salaam, Manyara, and Shinyanga and two districts in the Kagera Region. The sample was constructed to be representative at the district level, but not at the national level.
Unit of analysis
SHWALITA uses a sample of 4,029 households living in rural and urban areas. The sample is representative at the district level, but not at the national level.
Producers and sponsors
The field work for the Survey of Household Welfare and Labour in Tanzania (SHWALITA) was conducted from September 2007 to August 2008 in villages and urban areas from seven districts across Tanzania: one district in the regions of Dodoma, Pwani, Dar es Salaam, Manyara, and Shinyanga and two districts in the Kagera Region. The districts were purposively selected to capture variations between urban and rural areas as well as across other socio-economic dimensions to inform survey design related to labor statistics and consumption expenditure for low-income settings. The sample was constructed to be representative at the district level, but not at the national level.
Data from the 2002 Census were used to enumerate all villages in each district. In the first stage of the sampling process, a probability-proportional-to-size (PPS) sample of 24 villages was selected per district. In the second stage, a sub-village (or enumeration area, EA) was chosen within each village through simple random sampling. In the selected EA, all households were listed. From these lists, three households were randomly assigned to each of the eight modules. This was done through simple random sampling, starting with module 1 and moving to module 8. The five alternative recall questionnaires were administered during the 14 days the survey team spent in the EA to conduct the household diaries. Refusal and attrition after the start of the survey were not an issue. Among the original households selected for the survey and assigned to a module, there were 13 replacements due to refusals. Three households that started a diary were dropped because they did not complete their final interview. This yields a final sample size of 4,029 households.
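The sample-size arithmetic and the within-EA assignment described above can be sketched as follows. The design constants come from the text; the function names and the use of Python's `random` module are illustrative assumptions, not the survey team's actual procedure.

```python
import random

# Design constants taken from the sampling description.
DISTRICTS = 7
VILLAGES_PER_DISTRICT = 24   # one EA is selected per village
MODULES = 8
HOUSEHOLDS_PER_MODULE = 3
DROPPED_DIARY_HOUSEHOLDS = 3  # started a diary but missed the final interview


def planned_sample_size() -> int:
    """Households implied by the design before any are dropped."""
    return DISTRICTS * VILLAGES_PER_DISTRICT * MODULES * HOUSEHOLDS_PER_MODULE


def assign_modules(listed_households, rng):
    """Assign three listed households to each module, module 1 first.

    Households are drawn by simple random sampling without replacement,
    so no household receives more than one module. (Hypothetical helper.)
    """
    remaining = list(listed_households)
    assignment = {}
    for module in range(1, MODULES + 1):
        for household in rng.sample(remaining, HOUSEHOLDS_PER_MODULE):
            assignment[household] = module
            remaining.remove(household)
    return assignment


planned = planned_sample_size()              # 7 * 24 * 8 * 3 = 4032
final = planned - DROPPED_DIARY_HOUSEHOLDS   # 4032 - 3 = 4029

# Example: an EA listing of 100 households (IDs 0-99) with a fixed seed.
example = assign_modules(range(100), random.Random(0))
```

The 13 refusals do not change the total because each refusing household was replaced.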
The data does not come with sample weights.
Dates of collection
Mode of data collection
Computer Assisted Personal Interview [capi]
The Survey of Household Welfare and Labour in Tanzania (SHWALITA) survey experiment entailed fielding eight alternative consumption questionnaires randomly assigned to 4,000 households in Tanzania. The eight designs vary by method of data capture, level of respondent, length of reference period, number of items in the recall list, and nature of the cognitive task required of the respondent. Modules 1-5 are recall designs and modules 6-8 are diaries. These eight designs were strategically selected to reflect the most common methods in use. The alternative designs focus on variation in the measurement of food consumption and of expenditure on select frequently purchased non-food items, and not on general non-food expenditure where, due to the infrequency of purchase, there is a much greater degree of design harmonization in practice. Food consumption includes the quantity consumed from three sources (purchases, home production, and gifts/payments) and, for purchases only, the corresponding value of the quantity (in Tanzanian shillings). Modules 1 and 2 seek the consumption values for a long list of 58 commodities. In module 3, the subset list consists of the 17 most important food items, which constitute, on average, 77 percent of food consumption expenditure in Tanzania based on the previous Household Budget Survey. When comparing consumption expenditure in module 3 with other modules, we scale up food expenditures for that module (by 1/0.77), as is commonly done in practice. Module 4 is a collapsed list in which the 58 food items are aggregated into 11 comprehensive categories. Since respondents often use local units for reporting, the survey teams also established conversion factors for these local units and surveyed local prices so that quantity and value information in diaries could be checked for outliers.
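The module 3 adjustment mentioned above is a simple rescaling. A minimal sketch, assuming food expenditure on the 17-item subset list has already been totalled per household (the function and variable names are illustrative):

```python
# Share of total food expenditure covered by the 17-item subset list,
# as stated in the text (based on the previous Household Budget Survey).
MODULE3_FOOD_SHARE = 0.77


def scale_module3_food(expenditure_17_items: float,
                       share: float = MODULE3_FOOD_SHARE) -> float:
    """Scale up subset-list food expenditure to an estimate of total food expenditure."""
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    return expenditure_17_items / share


# A household reporting 7,700 shillings on the 17 items is treated as
# spending about 10,000 shillings on food overall.
estimated_total = scale_module3_food(7700.0)
```

The scaling applies the same average share to every household, so it corrects the level of module 3 food expenditure for comparison but cannot recover household-specific spending on the omitted items.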
Among the recall modules, module 5 deviates from a reporting of actual expenditure over a specified time period. Instead it asks for “usual” consumption, following a recommendation in Deaton and Grosh (2000), where households report the number of months in which the food item is usually consumed by the household, the quantity usually consumed, and usual expenditure in those months. These questions aim to measure permanent rather than transitory living standards, without interviewing the same households repeatedly throughout the year. Hence, module 5 introduces two key differences from the other recall modules: a longer time frame and a different cognitive task required of respondents.
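One plausible way to turn the three “usual” answers into an annual figure is sketched below. This is an assumption made for illustration: the source does not state how module 5 responses were aggregated, and the function names are hypothetical.

```python
def annual_usual_expenditure(months_consumed: int,
                             usual_monthly_expenditure: float) -> float:
    """Annual expenditure on an item: months it is usually consumed
    times usual expenditure in each of those months."""
    if not 0 <= months_consumed <= 12:
        raise ValueError("months_consumed must be between 0 and 12")
    return months_consumed * usual_monthly_expenditure


def average_monthly_expenditure(months_consumed: int,
                                usual_monthly_expenditure: float) -> float:
    """Spread the annual figure over all 12 months of the year."""
    return annual_usual_expenditure(months_consumed, usual_monthly_expenditure) / 12


# An item usually consumed in 6 months of the year at 2,000 shillings per month.
annual = annual_usual_expenditure(6, 2000.0)
monthly_average = average_monthly_expenditure(6, 2000.0)
```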
The three diary modules are of the “acquisition type.” Specifically, they add everything that came into the household through harvests, purchases, gifts, and stock reductions and subtract everything that went out of the household through sales, gifts, and stock increases. For example, items that are purchased to be resold, given away, or kept in stock are not counted as consumption. Two modules are household diaries, in which a single diary is used to record all household consumption activities. In the third diary module, each adult member keeps their own diary, while children are placed on the diaries of the adults who know most about their daily activities. Just over 52 percent of individuals in the respondent households maintained a personal diary, with the remaining members (typically children) allocated to a specific adult's personal diary. Following the literature, we label this as a “personal diary.” The personal diary was carefully designed to avoid double counting. Diary entries are specific to an individual and should leave no scope for double-counting purchases or self-produced goods. It is possible that a “gift” could be given to the household and accidentally recorded by two individuals. However, interviewers were trained to cross-check individual diaries for similar items purchased, produced, or gifted on the same day and to query these during the checks. In many cases, one person will acquire food for the household (such as buying 5 kilograms of rice), and this is entered in the diary of the person acquiring the food. So the personal diary is not an individual’s record of food consumption. Rather, it records the food brought into the household by each member, even if it is for several members to consume (as well as food consumed outside the household). This intensive supervision of the personal diary sample would be impractical for most surveys, but these investments were made in order to establish a benchmark for analytic comparisons.
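The acquisition-type accounting described above (inflows minus outflows) can be sketched as follows. The flow labels and the `DiaryEntry` structure are illustrative assumptions and do not correspond to variable names in the dataset.

```python
from dataclasses import dataclass

# Flow types named in the text: what comes in is added, what goes out is subtracted.
INFLOWS = {"harvest", "purchase", "gift_received", "stock_reduction"}
OUTFLOWS = {"sale", "gift_given", "stock_increase"}


@dataclass
class DiaryEntry:
    item: str
    flow: str        # one of the labels in INFLOWS or OUTFLOWS
    quantity: float  # in a common unit, e.g. kilograms


def net_consumption(entries):
    """Return net quantity consumed per item: inflows minus outflows."""
    totals = {}
    for entry in entries:
        if entry.flow in INFLOWS:
            sign = 1.0
        elif entry.flow in OUTFLOWS:
            sign = -1.0
        else:
            raise ValueError(f"unknown flow type: {entry.flow}")
        totals[entry.item] = totals.get(entry.item, 0.0) + sign * entry.quantity
    return totals


# A household buys 5 kg of rice, gives 1 kg away, and puts 1 kg into stock.
diary = [
    DiaryEntry("rice", "purchase", 5.0),
    DiaryEntry("rice", "gift_given", 1.0),
    DiaryEntry("rice", "stock_increase", 1.0),
]
consumed = net_consumption(diary)
```

Here the rice given away or kept in stock is not counted, leaving 3 kg as consumption.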
Each of the eight designs varies how food expenditures (including the value of home production and consumption) are collected. Non-food items are divided into two groups based on frequency of purchase. Frequently purchased items (charcoal, firewood, kerosene/paraffin, matches, candles, lighters, laundry soap, toilet soap, cigarettes, tobacco, cell phone and internet, and transport) were collected by 14-day recall for modules 1-5 and in the 14-day diary for modules 6-8. Non-frequent non-food items (utilities, durables, clothing, health, education, contributions, and other; housing is excluded) are collected by recall identically across all modules at the end of the interview (and at the end of the two-week period for the diaries), over the same one-month or 12-month reference period, depending on the item in question. We attribute any cross-module differences in measured non-frequent non-food consumption to spillovers from the different amounts of memory training or conditioning of respondents that the different food consumption modules may induce.
The Survey of Household Welfare and Labour in Tanzania (SHWALITA) survey experiment used double blind data entry and high levels of quality control to reduce any effect of data entry on estimated cross-module differences. The data entry protocol was the same for all versions of the questionnaire and hence should not be a source of systematic error biasing the comparison of survey module performance.
Before being granted access to the dataset, all users have to formally agree:
1. To make no copies of any files or portions of files to which s/he is granted access except those authorized by the data depositor.
2. Not to use any technique in an attempt to learn the identity of any person, establishment, or sampling unit not identified on public use data files.
3. To hold in strictest confidence the identification of any establishment or individual that may be inadvertently revealed in any documents or discussion, or analysis. Such inadvertent identification revealed in her/his analysis will be immediately brought to the attention of the data depositor.
- Public use files, accessible to all
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the source and date of download
World Bank. Survey of Household Welfare and Labour in Tanzania (SHWALITA) 2007-2008. Public Use Dataset. Downloaded from [URL] on [Date]
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.