A detailed description of the sampling methodology is available in the appendix to the "Basic Information Document".
The TLSS sample was designed to allow reliable estimation of poverty and of most other living standard indicators for each domain of interest, based on a representative probability sample at the level of:
• Tajikistan as a whole
• Total urban and total rural areas
• The five main administrative regions (oblasts) of the country: Dushanbe, Rayons of Republican Subordination (RRS), Sogd, Khatlon, and Gorno-Badakhshan Autonomous Oblast (GBAO)
The last census was conducted in 2000 and covered all five main administrative regions (oblasts) of the country (Dushanbe, RRS, Sogd, Khatlon, and GBAO). Each oblast was further subdivided into smaller areas called census sections, instructor's sectors, and enumeration sectors (ES). Each ES is either entirely urban or entirely rural. The list of ESs includes census information on the population of each ES, and the ES lists were grouped by oblast.
In 2005, UNICEF implemented a Multiple Indicator Cluster Survey (MICS05) in Tajikistan during which an electronic database of the ES information was created. Information in this database included: oblast, rayon, jamoat, settlement type, city/village, ES code, and population. Information from this database was used in the sample design of the TLSS07.
The total number of clusters for the TLSS07 was set at 270 and the number of households per cluster at 18, resulting in a sample size of 4,860 households. The sample size was determined by taking into account:
• The reliability of the survey estimates at both the regional and national levels
• Quality of the data collected for the survey
• Cost in time for the data collection
• An oversample in 7 rayons in Khatlon
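The sample size quoted above follows directly from the cluster design. A minimal check of the arithmetic, using only the two figures stated in the text (the variable names are illustrative):

```python
# Design figures stated in the text: 270 clusters, 18 households per cluster.
clusters = 270
households_per_cluster = 18

# Total number of sampled households.
sample_size = clusters * households_per_cluster
print(sample_size)
```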
Three questionnaires were used to collect information for the TLSS07: a household questionnaire, a female questionnaire for recording information about women of childbearing age, and a community questionnaire. These questionnaires were based on the TLSS questionnaires used in 2003, with some changes. Questions were added to existing modules, and new modules were added to collect information to be used for MICS analyses. These included HIV/AIDS awareness, and Immunizations and Anthropometric Measurements for children 0 to 5 years old. Other new modules on Migration, Financial Services, Subjective Poverty and Food Security, and Subjective Beliefs were also added. The Labor Market Module was changed substantially from 2003 to give a better view of the informal labor market. The food expenditures module included additional food products. The HIV/AIDS questions were removed from the female questionnaire and were applied to all household members 12 to 49 years old.
Because the First Round questionnaire was very long, it was decided to collect some information in a second round of visits to the households. The Second Round Household Questionnaire was shorter and was used primarily to collect the additional information that could not be collected in the First Round. The Household Questionnaire was the main instrument used during the Second Round. The female questionnaire was used only if females were added to the household after the First Round, and the community questionnaire was not repeated. In the Second Round Household Questionnaire, the time reference period for the Food Security module was reduced from 4 weeks to 2 weeks. This was done because, for the households visited at the beginning of the Second Round, a 4 week period would have included the last portion of the Ramadan period.
Data Entry and Cleaning
The data entry program was designed using CSPro, a data entry package developed by the US Census Bureau. This software allows programs to be developed to perform three types of data checks: (a) range checks; (b) intra-record checks to detect inconsistencies within a particular module of the questionnaire; and (c) inter-record checks to detect inconsistencies between the different modules of the questionnaire.
The data from the First Round were key entered at the Goskomstat headquarters in Dushanbe from 4 October 2007 through 25 November 2007. The Second Round and Sughd data were key entered from 26 November 2007 through 12 December 2007. All of the data were double entered, with double entry of the First Round, Second Round, and Sughd re-collection data completed by 22 January 2008.
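The three kinds of checks can be illustrated with a short sketch. This is not the CSPro program used for the survey; it is a Python illustration in which every field name (age, had_visit, n_visits, member_id) and every limit is a hypothetical stand-in chosen only to show the idea.

```python
def range_check(record, field, low, high):
    """(a) Range check: the value must be present and lie within [low, high]."""
    value = record.get(field)
    return value is not None and low <= value <= high


def intra_record_check(record):
    """(b) Intra-record check: two answers in the same module must agree.

    Hypothetical rule: a person who reports no outpatient visit
    must report zero visits, and otherwise at least one.
    """
    if record["had_visit"] == 0:
        return record["n_visits"] == 0
    return record["n_visits"] >= 1


def inter_record_check(module_records, roster_ids):
    """(c) Inter-record check: every member in a module must appear in the roster.

    Returns the member IDs that are missing from the roster.
    """
    return [r["member_id"] for r in module_records
            if r["member_id"] not in roster_ids]


# Hypothetical example data.
roster_ids = {1, 2, 3}
health_records = [
    {"member_id": 1, "age": 34, "had_visit": 1, "n_visits": 2},
    {"member_id": 2, "age": 150, "had_visit": 0, "n_visits": 1},
    {"member_id": 7, "age": 12, "had_visit": 0, "n_visits": 0},
]

range_results = [range_check(r, "age", 0, 120) for r in health_records]
intra_results = [intra_record_check(r) for r in health_records]
missing_members = inter_record_check(health_records, roster_ids)
```

In this example the second record fails both the range check and the intra-record check, and the third record refers to a member who is not in the roster.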
The data cleaning process began in February 2008 and was completed at the end of May 2008.
There are three separate databases with the data from the TLSS07. The data from each data collection are maintained separately. The data sets have similar names in each of the three data collections. First Round data sets have names in the form of “r1mnp” where “n” is the number of the module, and “p” is the part of the module (if any). Data from the Subjective Poverty module would be stored as “r1m9” and data from the Migration module, Part C Family Members Living Away from the Household, would be stored as “r1m2c”. Second Round data set names have a similar form, “r2mnp”. Data sets from the Sughd collection replace the “m” of the First Round with “sm”, such as “sm12a1”.
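As an illustration of the naming pattern for the First and Second Round data sets, the following sketch builds a data set name from its components. The function name and arguments are hypothetical and are not part of the distributed files; the Sughd names are deliberately left out because the text gives only a single example of them.

```python
def dataset_name(round_number, module, part=""):
    """Build a name of the form r<round>m<module><part>, e.g. "r1m2c".

    round_number: 1 for the First Round, 2 for the Second Round.
    module:       module number.
    part:         part of the module, if any (for example "c").
    """
    if round_number not in (1, 2):
        raise ValueError("only Round 1 and Round 2 names are covered here")
    return f"r{round_number}m{module}{part.lower()}"


subjective_poverty = dataset_name(1, 9)        # "r1m9"
migration_part_c = dataset_name(1, 2, "C")     # "r1m2c"
second_round_example = dataset_name(2, 9)      # "r2m9"
```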
The variable names have a similar format. Each variable name includes the module in which the variable is found and the question number. For example, question 10 in Module 4 Health, Part B Utilization of Outpatient Health Care is “m4b_q10”. The variable names in all three of the data collections have the same format.
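The variable naming pattern can be sketched in the same way. The helper names and the regular expression are assumptions made for illustration; they cover only names of the form shown in the example (“m4b_q10”) and would not match variables with any other suffix.

```python
import re

# Assumed pattern: "m", module number, optional part, "_q", question number.
_VARIABLE = re.compile(r"^m(\d+)([a-z][a-z0-9]*)?_q(\d+)$")


def variable_name(module, question, part=""):
    """Build a variable name such as "m4b_q10"."""
    return f"m{module}{part.lower()}_q{question}"


def parse_variable_name(name):
    """Split a variable name into (module, part, question)."""
    match = _VARIABLE.match(name)
    if match is None:
        raise ValueError(f"not in the m<module><part>_q<question> form: {name!r}")
    module, part, question = match.groups()
    return int(module), part or "", int(question)


health_q10 = variable_name(4, 10, "b")        # "m4b_q10"
parsed = parse_variable_name("m4b_q10")       # (4, "b", 10)
```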
In addition to the individual roster files for each database, there is one combined roster file for all three databases, rosterall. This roster file contains information on all of the households and household members included in the data. A variable (source) indicates whether the household/member is: (a) in Round 1 only; (b) in Round 2 only; (c) in Round 1 and Round 2; or (d) in the Sughd data. It is important to pay attention to this variable because the recall period for the Subjective Poverty and Food Security Module (9A) is the last 4 weeks in the First Round, but was changed to the last 2 weeks in the Second Round and the Sughd collection. In addition, the order of the question in the Expenditure On Food In The Last 7 Days, Module 10, changed
Use of the dataset must be acknowledged with a citation that includes:
- Identification of the Primary Investigator
- Title of the survey (including the year of implementation)
- Survey reference number
- Source and date of download
Tajikistan State Statistical Agency. Tajikistan Living Standards Survey (TLSS) 2007. Ref. TJK_2007_TLSS_v01_M. Dataset downloaded from www.microdata.worldbank.org on [date]
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.
LSMS Data Manager
The World Bank
World Bank, Development Economics Data Group
Production of metadata
v02 (August 2016)
- The survey title was changed to Tajikistan Living Standards Survey (TLSS) to match the questionnaire
- Study ID changed from TJK_2007_LSMS_v01_M to TJK_2007_TLSS_v01_M