Survey of Living Standards 2007 and Extension 2008
Living Standards Measurement Study [hh/lsms]
The 2007 Timor-Leste Survey of Living Standards (TLSLS 2) is the second national survey of living standards for Timor-Leste. The first national survey, the Timor-Leste Living Standards Survey (TLSS), was undertaken in 2001 during the months of August to November. The 2001 TLSS had a modest though nationally representative sample of 1800 households from 100 sucos. Being the first national living standards survey of its kind following the independence referendum of August 1999, the TLSS provided a wealth of information on living conditions in the country as an input into the first National Development Plan. The second national living standards survey, the TLSLS, has been undertaken to update this information and is also expected to provide an input into the development of the second National Development Plan.
In 2007-2008 a multi-topic household survey, the Timor Leste Survey of Living Standards (TLSLS2), was conducted in East Timor with the main objectives of developing a system of poverty monitoring, supporting poverty reduction, and monitoring human development indicators and progress toward the Millennium Development Goals. Information collected in the TLSLS2 questionnaire included: household information, housing, access to facilities, expenditures/consumption, education, health, fertility and maternity history, employment, farming and livestock, transfers, borrowing and saving, other income, social capital, subjective well-being, AIDS and anthropometrics.
The TLSLS2-X extension survey was designed to re-visit one third of the households interviewed under the TLSLS2 2007-08 to explore different facets of household welfare and behavior in the country, while also being able to make use of information collected in the TLSLS2 survey for analytic purposes.
The four new topics investigated in the extension survey are:
- Risk and Vulnerability: This section is designed to help us understand the dimensions and sources of household-level vulnerability to uninsured risks in Timor Leste, and the efficacy and welfare effects of various risk-management strategies (prevention, mitigation, coping) and mechanisms (private as well as public, formal as well as informal) households do (or do not) have access to. The work in Timor Leste is part of a program of analytic work and policy dialogue throughout the EAP region, more information on which can be found on the World Bank website.
- Land Degradation and Poverty: This section of the questionnaire is designed to identify proximate causes of deforestation through land use patterns and links with poverty; understand strengths and failures of common land resource management institutions (property rights, enforcement); understand the impact of the Siam Weed problem on household welfare.
- Justice for the Poor: The Justice for the Poor/Access to Justice (J4P/A2J) module of the survey will serve mainly as an initial diagnostic for project development in the country. The topics covered are Dispute Processing/Resolution; Social Legal Norms; and Perceptions of Efficiency in Government (Local, Sub-District, District and National level).
- Access to Financial Services: The financial service work has the following two objectives: (i) to collect data on access to and use of financial services (savings and credit), both formal and informal, and (ii) to assess the quality of information on access to financial services obtained from heads of household vs. from all adults, i.e. whether a bias is introduced by not asking all household members, and whether the characteristics of the head or the household affect this (gender, age, nuclear family, urban, education levels, wealth, etc.).
Kind of Data
Sample survey data [ssd]
Unit of Analysis
- Household level data: Housing, Transfers, borrowing and savings, Subjective wellbeing
- Individual level data: Household Roster, Education, Health, Fertility, Employment, AIDS, and Anthropometrics
- Housing: Ownership and expenditures
- Access to facilities
- Health: Access to health care providers
- Food and non-food consumption and expenditures
- Durables: consumption & expenditures
- Employment: Jobs
- Self-employment and business
- Farming: Plots, Crops, Farming equipment, Agricultural inputs, Labour, Forestry, Livestock, Fishing
- Transfers, borrowing and savings
- Other income
- Social capital
2008 extension survey:
- Household Information
- Land Management
- Forest Use
- Individual Financial Information
- Shocks and Vulnerability
- Incidence of Shocks and Household Responses
- Future Shocks
- Preventive Health
- Program Participation
- Household Financial Information
- Justice for the Poor
- Community Trust and Decision Making
- Opinion and Perceptions of the Law
- Local Institutions
- Dispute Resolution
Domains: Urban/rural; Regional
Producers and sponsors
National Statistics Directorate
SAMPLE DESIGN FOR THE 2007-2008 SURVEY:
The TLSLS sample was designed to have two components: (i) a cross-sectional component of 4,500 households selected with the intention of representing the current population of Timor-Leste, and (ii) a panel component of 900 households, in which half of the 2001 TLSS sample of 1,800 households were randomly selected and re-interviewed, with the purpose of evaluating changes in living conditions for the same set of households between the two surveys. However, this panel component is not being released at this time, so it is neither covered in the rest of the documentation nor included in the data files. The cross-sectional component is expected to provide independent estimates for rural and urban areas of each of five recently defined regions, which are groups of districts defined as follows:
- Region 1: Baucau, Lautem and Viqueque;
- Region 2: Ainaro, Manufahi and Manatuto;
- Region 3: Aileu, Dili and Ermera;
- Region 4: Bobonaro, Cova Lima and Liquiçá; and
- Region 5: Oecussi.
The cross-sectional sample is selected in two stages. In the first stage, 300 Census Enumeration Areas (EAs) are selected as the primary sampling units (PSUs). In the second stage, 15 households are selected in each EA. The design recognizes ten explicit strata - the Urban and Rural areas in each of the five regions. The allocation of the 300 cross-sectional PSUs among regions resulted from the following line of reasoning:
- In spite of their different populations and total number of households, sampling theory dictates that a sample of roughly the same size (60 EAs) should be allocated to each region in order to produce estimates of similar quality for each of them.
- A similar case could have been made for allocating a sample of the same size (30 EAs) to urban and rural areas within each region, but since the definition of urban and rural areas outside Dili was still a matter of discussion, it was decided to opt for an allocation closer to proportional: 25 EAs in Urban areas and 35 EAs in Rural areas.
- Region 5 represents a special case. It is composed of a single district of difficult access (Oecussi) that ought to be the responsibility of a dedicated team. This imposed a total sample size of 50 EAs for this region, of which only 48 can be allocated to the cross-sectional component since the panel component contains two EAs in Oecussi.
- The capacity thus liberated to visit an additional 12 EAs in the rest of the country was devoted to reinforcing the urban sample in Region 3, where Dili is located.
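The allocation reasoning above can be checked with a few lines of arithmetic. This is only a sketch: it uses the totals stated in the text, and because the urban/rural split within Region 5 is not given, that region is kept as a single total. The variable names are illustrative.

```python
# Illustrative check of the PSU allocation described above.
# The urban/rural split within Region 5 is not stated in the text,
# so Region 5 is carried as one total.
BASE_URBAN, BASE_RURAL = 25, 35   # per-region allocation for Regions 1-4
REGION5_CROSS_SECTION = 48        # 50 EAs minus the 2 panel EAs in Oecussi
EXTRA_URBAN_REGION3 = 12          # capacity freed by the smaller Region 5 sample
HOUSEHOLDS_PER_EA = 15

allocation = {
    region: {"urban": BASE_URBAN, "rural": BASE_RURAL} for region in (1, 2, 3, 4)
}
allocation[3]["urban"] += EXTRA_URBAN_REGION3

total_psus = sum(a["urban"] + a["rural"] for a in allocation.values())
total_psus += REGION5_CROSS_SECTION
planned_households = total_psus * HOUSEHOLDS_PER_EA

print(total_psus, planned_households)
```

The totals reproduce the 300 PSUs and the planned 4,500 cross-sectional households mentioned in the sample design.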
The first sampling stage used the list of 1,163 Census Enumeration Areas (EAs) generated by the 2004 Census as a sample frame. Within each stratum, the allocated number of EAs was selected with probability proportional to size (pps) using the number of households reported by the census as a measure of size. No efforts were made to append the smaller EAs to neighboring EAs, or to segment the larger EAs in order to make the size of the primary sampling units (PSUs) more uniform.
The second sampling stage used an exhaustive household listing operation in all selected EAs as its sample frame. Sample households in each EA were selected from the list by systematic equal probability sampling.
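The two stages can be sketched as follows. The documentation does not say which pps variant was used, so this example assumes systematic pps, one common choice that allows a large EA to be selected more than once. The function names, the seed and the EA sizes are all made up for illustration.

```python
import random


def systematic_pps(sizes, n_select, rng):
    """Assumed variant: systematic pps. Returns the number of hits per unit."""
    total = sum(sizes)
    step = total / n_select
    start = rng.random() * step          # random start in [0, step)
    hits = [0] * len(sizes)
    idx, upper = 0, sizes[0]
    for k in range(n_select):
        point = start + k * step
        # Advance to the unit whose cumulative size range contains the point.
        while point >= upper and idx < len(sizes) - 1:
            idx += 1
            upper += sizes[idx]
        hits[idx] += 1
    return hits


def systematic_equal_probability(listing, n_select, rng):
    """Systematic equal-probability selection from a household listing."""
    if len(listing) <= n_select:
        return list(listing)             # small EA: take every household
    step = len(listing) / n_select
    start = rng.random() * step
    last = len(listing) - 1
    return [listing[min(int(start + k * step), last)] for k in range(n_select)]


rng = random.Random(2007)                     # arbitrary seed
ea_sizes = [120, 45, 300, 80, 60, 210, 95]    # hypothetical census household counts
hits = systematic_pps(ea_sizes, n_select=4, rng=rng)

# An EA hit t times contributes 15 * t households, as described in the text.
sample = {
    ea: systematic_equal_probability(list(range(ea_sizes[ea])), 15 * t, rng)
    for ea, t in enumerate(hits)
    if t > 0
}
```

With these made-up sizes the sampling interval is smaller than the largest EA, so that EA can be hit twice, which mirrors the repeated selections reported for the actual sample.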
As a result of the relatively large sampling fraction in some of the strata, certain large EAs were selected more than once by the pps procedure adopted at the first sampling stage. In fact, the cross-sectional sample consists of only 269 (rather than 300) different EAs. This necessitated selecting a multiple of 15 households (rather than just 15 households) in the EAs that were selected more than once. The final cross-sectional sample consists of 4,477 households. Table 2 shows the distribution of the total TLSLS sample across the rural and urban areas of the five main regions in the country. The sample can be considered representative at national level as well as at the level of the ten domains represented by the rural and urban areas of the five regions.
Lastly, it may be helpful to clarify the definition of urban and rural areas. At the time of the 2001 TLSS, 71 of Timor-Leste's 498 sucos were conventionally qualified as urban, of which 31 sucos in the Dili and Baucau districts were qualified as major urban centers. By the time of preparation of the sample design for the 2007 TLSLS, 60 of the 498 sucos defined by the 2001 Suco Survey were conventionally qualified as urban. The partition of the country into sucos was also modified in September 2004. With the amalgamation of several sucos, the original 498 sucos were collapsed into 442. Many of the rearrangements took place in urban areas, with the result that the 60 "old" sucos now considered urban constitute only 38 "new" sucos.
SAMPLE DESIGN FOR THE 2008 EXTENSION SURVEY
The sample for the TLSLS2 Extension survey was a sub-sample of the original TLSLS2 sample. The TLSLS2 field work was divided into 52 "weeks", with each week being a random subset of the total sample. The sub-sample was chosen by randomly selecting 19 weeks from the original field work schedule.
Each week contained seven Primary Sampling Units (PSUs) for a total of 133 PSUs. In each PSU the teams were to interview 12 of the original 15 households, with the remaining three to serve as replacements. The total nominal sample size was thus 1596.
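The nominal sample size follows directly from these figures. The snippet below is a sketch of the arithmetic; the random draw of weeks uses an arbitrary seed and does not reproduce the weeks actually selected.

```python
import random

TOTAL_WEEKS = 52
WEEKS_SELECTED = 19
PSUS_PER_WEEK = 7
INTERVIEWS_PER_PSU = 12
ORIGINAL_HOUSEHOLDS_PER_PSU = 15

rng = random.Random(0)  # arbitrary seed, for illustration only
selected_weeks = sorted(rng.sample(range(1, TOTAL_WEEKS + 1), WEEKS_SELECTED))

n_psus = WEEKS_SELECTED * PSUS_PER_WEEK
nominal_sample = n_psus * INTERVIEWS_PER_PSU
replacements_per_psu = ORIGINAL_HOUSEHOLDS_PER_PSU - INTERVIEWS_PER_PSU

print(n_psus, nominal_sample, replacements_per_psu)
```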
Following the collection and initial analysis of the data, it was determined that data from one district, Manatuto, and partially from another district, Oecussi, were of insufficient quality in certain modules. Therefore it was decided to repeat the survey in another 25 PSUs of these two districts - six in Manatuto, and 19 in Oecussi. The additional PSUs chosen were randomly selected within the two districts from the remaining non-panel PSUs in the original TLSLS2 sample.
SELECTION PROBABILITIES AND RAISING FACTORS FOR THE 2007-2008 SURVEY:
For the cross-sectional sample of TLSLS, the selection probabilities and raising factors are determined in accordance with the sample design described above.
The probability of selecting Census Enumeration Area ij in stratum i is
pij = (mi * nij) / ni     (1)
where nij is the number of households in the EA (as reported by the 2004 Census), ni is the total number of households in the stratum (also as per the 2004 Census) and mi is the number of EAs selected in the stratum.
The probability of selecting household ijk in EA ij of stratum i is
pijk = pij * (15 / n'ij)     (2)
where n'ij is the number of households in the EA, as per the household listing operation.
The raising factor or weight wijk for household ijk is the inverse of the selection probability pijk. If the number n'ij of households found at the time of the listing operation were equal to the number nij recorded by the census in all EAs, the sample would be self-weighted in each stratum, with a constant raising factor equal to ni / (15 * mi). In practice the numbers nij and n'ij will seldom be equal but often close to each other, meaning that the samples will not be exactly self-weighted, but quite approximately so.
[Note: Strictly speaking, the above formulae are valid only when the size of the EA is such that it can be selected at most once by the pps procedure. However, the artifact of selecting 15t households in the second stage whenever an EA is selected t times in the first stage has the effect of making them applicable to compute raising factors even for the large EAs where that may not be the case. Formula (2) may be inadequate if the actual size n'ij of EA ij happens to be less than 15. In that (quite unlikely) case, all households in the EA will need to be visited, and pijk simplifies to pij.]
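As a worked illustration of the two selection probabilities and the resulting raising factor, the sketch below uses invented stratum and EA counts; only the formulas and the constant 15 come from the text, and the function names are hypothetical.

```python
HOUSEHOLDS_PER_EA = 15


def ea_probability(m_i, n_ij, n_i):
    """First-stage (pps) selection probability of an EA: m_i * n_ij / n_i."""
    return m_i * n_ij / n_i


def raising_factor(m_i, n_ij, n_i, n_listed):
    """Inverse of the household selection probability.

    n_listed is the household count from the listing operation (n'ij).
    If the EA has 15 or fewer listed households, all are visited and the
    household probability reduces to the EA probability, as in the note above.
    """
    p_ij = ea_probability(m_i, n_ij, n_i)
    if n_listed <= HOUSEHOLDS_PER_EA:
        p_ijk = p_ij
    else:
        p_ijk = p_ij * HOUSEHOLDS_PER_EA / n_listed
    return 1.0 / p_ijk


# Invented example: 35 EAs drawn from a stratum of 14,000 census households.
m_i, n_i = 35, 14_000
w_same = raising_factor(m_i, n_ij=120, n_i=n_i, n_listed=120)
w_grown = raising_factor(m_i, n_ij=120, n_i=n_i, n_listed=130)
self_weighting_constant = n_i / (HOUSEHOLDS_PER_EA * m_i)

print(round(w_same, 3), round(w_grown, 3), round(self_weighting_constant, 3))
```

When the listing count equals the census count, the weight equals the self-weighting constant; when the EA has grown since the census, the weight rises in proportion.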
The household weights are further adjusted such that the population totals as estimated from the full sample match the demographic projections for mid-2007 for each stratum. This corresponds to a mid-2007 total population for Timor-Leste of 1,047,632 persons.
[Note: This population total relates to the medium-level projection in DNE (2007), Population Projections 2004-2050: Analysis of Census Results, Report 1, General Population Census of Timor-Leste 2004.]
WEIGHTS FOR THE 2008 EXTENSION SURVEY:
Due to the necessity of additional interviews, there are three possible combinations of the data, with each combination having its own set of weights:
(1) the original extension data
(2) the original data, excluding the "questionable" data and including the additional interviews
(3) the complete data, including both all the original data and the additional interviews
Therefore three different sets of sampling weights have been calculated.
The sample weights for the extension survey are indirect weights based on the original probability weights calculated for the TLSLS2. The TLSLS2 weights were calculated by Juan Muñoz of Sistemas Integrales. These weights were based on each household's selection probability, and then scaled by an adjustment factor, intended to match the demographic projections for the population of urban and rural areas in the five major regions of Timor Leste in mid-2007.
As the extension survey was selected as a sub-sample of the original TLSLS sample, the original weights were used as a basis for the construction of the extension weights. In consultation with Juan Muñoz, it was decided to also use adjustment factors, to have the population estimates of the extension survey match the demographic projections in urban and rural areas of the five regions, for each of the three possible combinations of data described above.
The indirect weights are constructed by determining a scaling factor for the original weights which would bring the estimated population of the ten strata up to the same level as the original population projections. Separate scaling factors were calculated for each of the three possible combinations of the data. (Scaling factors are included in weights provided in the dataset and do not need to be added to analysis.)
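A minimal sketch of how such scaling factors could be computed is shown below. The exact procedure is not documented here, so this assumes a simple ratio adjustment in which each household contributes its original weight times its household size to the estimated population. All field names (`base_weight`, `size`, `questionable`, `additional`), the stratum label and the numbers are hypothetical.

```python
from collections import defaultdict


def in_combination(hh, combo):
    """Membership in the three data combinations described above."""
    if combo == 1:   # original extension data
        return not hh["additional"]
    if combo == 2:   # original minus "questionable", plus additional interviews
        return hh["additional"] or not hh["questionable"]
    if combo == 3:   # everything
        return True
    raise ValueError(f"unknown combination: {combo}")


def scaling_factors(households, projections, combo):
    """Assumed ratio adjustment: projection / weighted population, per stratum."""
    estimated = defaultdict(float)
    for hh in households:
        if in_combination(hh, combo):
            estimated[hh["stratum"]] += hh["base_weight"] * hh["size"]
    return {s: projections[s] / est for s, est in estimated.items()}


# Invented data for a single stratum.
households = [
    {"stratum": "R2-rural", "base_weight": 100, "size": 5, "questionable": False, "additional": False},
    {"stratum": "R2-rural", "base_weight": 100, "size": 3, "questionable": True, "additional": False},
    {"stratum": "R2-rural", "base_weight": 100, "size": 4, "questionable": False, "additional": True},
]
projections = {"R2-rural": 1000}

factors = {c: scaling_factors(households, projections, c)["R2-rural"] for c in (1, 2, 3)}
print(factors)
```

In this toy example the factor for the third combination is the smallest, because pooling the original and additional interviews enlarges the sample in the stratum.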
The scaling factors are constant across the three sets of weights for regions 1, 3 and 4 - those unaffected by re-interviews. Region 2 shows a small change between factors 1 and 2, based on the slightly different composition of the households but having the same sample size. There is a decrease between factors 1 and 2 and factor 3 as the third combination of the data includes both the original and re-sampled households, therefore effectively over-sampling region 2. The reduction in the scaling factor and by extension the weights corrects estimates for this over-sampling.
Region 5 (Oecussi) shows a large change between factors 1 and 2, in addition to the expected decrease due to over-sampling in factor 3. This large change results from the difficulty of exactly reproducing the original stratification of the PSUs into urban and rural areas in Oecussi. Therefore urban households in Oecussi were effectively over-sampled during the original extension, then under-sampled during the re-interviews. Because of the resulting large differences in factors 1 and 2, it was decided that in Oecussi alone, it would be more logical to calculate the adjustment factors based on the population as a whole rather than including the urban/rural stratifications.
None of the estimates from the extension data with the indirect weights are statistically different from the original estimates based on the TLSLS2 data, nor are these estimates statistically significantly different from each other.
Dates of Data Collection
Data Collection Mode
Data Collection Notes
2007-2008 SURVEY FIELDWORK
The TLSLS was designed to run over a period of a full year in order to better account for any seasonal variation in different indicators. In addition, the fieldwork was designed to be more or less evenly spread throughout the country over the year. The TLSLS was launched on March 27, 2006. However, after about eight weeks of fieldwork, the survey had to be suspended due to the outbreak of conflict in the country.
The survey was resumed on January 9, 2007 and survey operations progressed without interruption thereafter. Fieldwork for the survey concluded on January 22, 2008.
At the time of the resumption of the survey, a decision was made to revisit the households who were interviewed in 2006 prior to the interruption of the survey. In particular, 351 households had been visited in 2006. Of these, 317 households were revisited during December 2007-January 2008. The remaining 34 households could not be found at the time of the revisits, and instead an additional 41 new households were interviewed as replacement households. In order to maintain a sample for a continuous period of a year, the final TLSLS sample thus excludes the 351 households interviewed in 2006 and instead includes the 358 revisited or replaced households.
Given the challenges of the turbulent political and security situation during some periods in 2007, the fieldwork schedule occasionally had to be slightly modified to accommodate concerns of security and feasibility of fieldwork. Despite this, the distribution of the sample by month of interview and by region and rural and urban areas indicates a sample that is well-spread through the year, which should allay any concerns of intra-year seasonality.
2008 EXTENSION SURVEY
As one of the objectives of the survey was to measure how much the head of the household knows about the financial activities of all household members, and to be able to produce gender-disaggregated analytical work, it was decided that concurrent interviews would be conducted with the head of the household and his/her spouse, to prevent each from influencing the other's responses. To accomplish this, the interview was conducted in two parts. During the first part, one interviewer would complete the roster and agricultural modules with the household head, or the household member best able to respond to these questions. They would then schedule a time to return to the household to interview the household head and spouse separately. Two separate books were therefore used.
This survey also included the randomized assignment of the justice and vulnerability modules to the household head or spouse. This was done using a randomized number and designation printed on the cover of the questionnaire. Additionally, this designation was used to assign the household either the short (household head only) or long (all household members over the age of 15) finance module.
The pre-printed random number designated the household into an "odd" or "even" category. The randomization was generally well followed with the exception of one team. All of the interviews conducted by this team are included in the "questionable" category. Any gender disaggregated analysis should therefore exclude the "questionable" households and include the "re-interview" households, using sampling weights w2.
In all households, sections 1 and 2 were asked of the respondent best able to answer. In even households, interviewers administered the justice module to the household head, the vulnerability module to the spouse, and the finance module to all household members. In odd households, interviewers administered the vulnerability module to the household head, the justice module to the spouse, and the finance module to the household head only.
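The assignment rule can be written as a small lookup. This is only a restatement of the rule above in code; the function name and the string labels are illustrative and are not variables in the dataset.

```python
def module_assignment(designation):
    """Return who answers each module for an "odd" or "even" household."""
    if designation == "even":
        return {
            "justice": "household head",
            "vulnerability": "spouse",
            "finance": "all household members over the age of 15",  # long module
        }
    if designation == "odd":
        return {
            "justice": "spouse",
            "vulnerability": "household head",
            "finance": "household head only",  # short module
        }
    raise ValueError(f"designation must be 'odd' or 'even', got {designation!r}")


print(module_assignment("even")["justice"])
print(module_assignment("odd")["justice"])
```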
The division of the questionnaire into household head and spouse was undertaken to guarantee the separation of the two respondents. During field work, while a priority was placed on assigning female interviewers to female respondents and male interviewers to male respondents, this was not always possible due to scheduling conflicts.
To ensure that the correct households were re-surveyed during the TLSLS2-X, the covers of the questionnaires were individually pre-printed with the household location information and the designation of "odd" or "even". Additionally, on the reverse side of the cover, the roster and plot lists from the TLSLS-2 questionnaire were printed as reference.
The roster for the TLSLS2-X listed all current household members. Interviewers compared this information to the pre-printed roster information from the TLSLS2 survey to identify new members. New members were asked a series of questions relating to age, marital status, education, etc., which were skipped for existing members. All members were asked questions related to preventative health care. Interviewers were then asked to compare the new roster with the pre-printed one and determine the whereabouts and reason for leaving of all household members in the TLSLS2 survey who were no longer part of the household.
Note: In the case of a female-headed household, or a male-headed household without a spouse, the procedure changed for the justice and vulnerability modules. The vulnerability module was asked to the most appropriate respondent of the specified gender, while the justice module was skipped. In some cases, a second choice was not available for the vulnerability module and it was also skipped. The actual respondent can be linked to the roster through their personal identification number (pid). Also note that surveys were marked as complete if the interviewer followed the correct procedures, not if all sections were actually completed. Therefore, some surveys marked complete will be missing information for the vulnerability or justice modules.
Organization and Timing of 2008 Extension Survey:
Field work for the TLSLS2-X was carried out by field teams from the Direcção Nacional de Estatística [DNE], with training and supervision provided by the World Bank and by two private consulting firms, Mekong Economics and Sistemas Integrales. Each field team consisted of three interviewers, a supervisor, a data entry operator, and a driver. Data entry was concurrent with data collection, and performed in the field using laptop computers.
The questionnaire was developed by the individual topic teams within the World Bank in the fall of 2007. Pilot testing was conducted in January 2008. Unforeseen political events delayed the start of the training until May 2008. Ten days of training was conducted between May 13 and May 23 in Dili. Training was conducted by Sistemas Integrales with assistance from the World Bank, and consisted of both classroom exercises and field training. Field exercises were conducted in Dili, Aileu and Liquiçá districts.
Field work was originally scheduled for 20 weeks beginning in May and being completed in August. Questions as to the quality of the data arose during the compiling and cleaning process in September 2008. A "spot-check" data quality review mission was conducted in October 2008, and at that time it was determined that further interviews would be necessary in two districts. The additional interviews were conducted in November and December 2008. Data cleaning and compilation took place in January and February 2009, with the finished dataset being released to World Bank team members in February 2009. Plans are still on-going for public dissemination of the data.
2008 Extension Survey Data Cleaning
The TLSLS2-X had a significant number of responses in which the response is "other". In general, if the response clearly fit into a pre-coded response category, it was recoded into that category during the cleaning and compilation process. Some responses where additional information was provided were not recoded even though they clearly fit into pre-coded categories. For example, "agriculture project" would be recoded into the "agriculture" category, while "community garden" would not. Data users can either use the additional information, or re-code into categories as they see fit.
Potential Data Quality Issues in 2008 Extension survey
Similarly to the individual roster of the previous section, the plots listed in the previous survey are listed on the pre-printed cover page and all changes noted. The agricultural section, like the other sections, suffers from problems with open-ended questions. This is particularly the case for the question asking what community restrictions are placed on the clearing of forest land (section 2d). The translation of the original question was vague (using the Tetun word for "boundary" for "restriction"), and therefore many of the responses relate to physical boundaries on the land, such as stone walls and tree lines. Additionally, the translation of all answers from Tetun into English is imperfect, and those wishing to use this information for analytical purposes are advised to also refer to the original Tetun. Analysts should be careful in using the data from the open-ended questions because of translation problems. Also, it was noted during the training and field work that many interviewers had significant difficulties understanding definitions in some of the land management and investment questions. In general, however, all agricultural data may be used for analysis, using sampling weights w3.
It should be noted that the quality of the data for the finance experiment (comparing the knowledge of the household head to that of other household members) was not sufficient for the experiment to be deemed a success. Subsequent spot-checking revealed that in many cases, interviewers asked the household head about the financial activities of various household members instead of asking them directly. Therefore this data should only be used to measure access to finance at the household level. The finance sections were not repeated during the additional interviews in the replacement PSUs. Sampling weights w1 should be used when doing any analysis with this data.
Shocks and Vulnerability:
It was determined following the initial round of data collection that the shocks and vulnerability module had some issues with uneven interview quality. Two reasons were listed as potential causes of the data quality issues: (1) a fundamental inability to adequately translate both the word and concept of a "shock" into the Timorese context, and (2) incomplete or questionable responses to the health shock questions in particular. Analysis of health shocks should drop the "questionable" households and use the "re-interview" households, with sampling weights w2.
Justice for the Poor:
Similar to the shocks and vulnerability module, the justice module included a long series of follow-up questions if the household indicated having experienced a dispute during the recall period. Again, the number of disputes experienced by the household seemed extremely low compared to expectations. This was particularly a problem in the Manatuto district, in which no disputes were recorded during the first set of TLSLS2-X interviews. Analysis of the disputes section of the justice module should drop the "questionable" households and use the "re-interview" households, with sampling weights w2. The justice module also has a number of instances in which the specifications for "other" were not recorded. Every effort was made to ensure this data was as complete as possible, but gaps do remain. Also, data users should use caution when using the imputed rank variable in section 5D. The rank in terms of importance was not explicitly captured in the data entry software, and the rankings therefore had to be imputed from the order they were listed in the original data entry. Inconsistencies may exist in this variable.
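The weight recommendations scattered through the notes above can be collected in one lookup. The weight names w1, w2 and w3 come from this documentation; the topic keys and the function name are hypothetical labels chosen for this sketch.

```python
# Recommended sampling weights by type of analysis, as stated in the notes above.
# Topic keys are illustrative labels, not dataset variable names.
RECOMMENDED_WEIGHTS = {
    "agriculture": "w3",            # all agricultural data may be used
    "finance": "w1",                # finance sections were not repeated in re-interviews
    "health_shocks": "w2",          # drop "questionable", use "re-interview" households
    "disputes": "w2",               # drop "questionable", use "re-interview" households
    "gender_disaggregated": "w2",   # exclude "questionable", include "re-interview"
}


def recommended_weight(topic):
    """Look up the recommended weight; fail loudly for topics not covered here."""
    try:
        return RECOMMENDED_WEIGHTS[topic]
    except KeyError:
        raise KeyError(f"no recommendation recorded for topic {topic!r}") from None


print(recommended_weight("finance"))
```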
In receiving these data it is recognized that the data are supplied for use within my organization, and I agree to the following stipulations as conditions for the use of the data:
1. The data are supplied solely for the use described in this form and will not be made available to other organizations or individuals. Other organizations or individuals may request the data directly.
2. Three copies of all publications, conference papers, or other research reports based entirely or in part upon the requested data will be supplied to:
National Statistics Directorate
Caicoli, Dili, Timor Leste
The World Bank
Development Economics Research Group
LSMS Database Administrator
1818 H Street, NW
Washington, DC 20433, USA
tel: (202) 473-9041
fax: (202) 522-1153
3. The researcher will refer to the 2007 Timor Leste Living Standards Measurement Survey as the source of the information in all publications, conference papers, and manuscripts. At the same time, the National Statistics Directorate is not responsible for the estimations reported by the analyst(s).
4. Users who download the data may not pass the data to third parties.
5. The database cannot be used for commercial ends, nor can it be sold.
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
World Bank LSMS
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.