Living Standards Measurement Survey 2002 (Wave 1 Panel)
Over the past decade, Albania has been seeking to develop the framework for a market economy and more open society. It has faced severe internal and external challenges in the interim – extremely low income levels and a lack of basic infrastructure, the rapid collapse of output and inflation rise after the shift in regime in 1991, the turmoil during the 1997 pyramid crisis, and the social and economic shocks accompanying the 1999 Kosovo crisis. In the face of these challenges, Albania has made notable progress in creating conditions conducive to growth and poverty reduction.
A poverty profile based on 1996 data (the most recent available) showed that some 30 percent of the rural and some 15 percent of the urban population are poor, with many others vulnerable to poverty due to their incomes being close to the poverty threshold. Income related poverty is compounded by the severe lack of access to basic infrastructure, education and health services, clean water, etc., and the ability of the Government to address these issues is complicated by high levels of internal and external migration that are not well understood.
To date, the paucity of household-level information has been a constraining factor in the design, implementation and evaluation of economic and social programs in Albania. Multi-purpose household surveys are one of the main sources of information to determine living conditions and measure the poverty situation of a country, and provide an indispensable tool to assist policymakers in monitoring and targeting social programs.
Two recent surveys carried out by the Albanian Institute of Statistics (INSTAT) – the 1998 Living Conditions Survey (LCS) and the 2000 Household Budget Survey (HBS) – drew attention, once again, to the need for accurately measuring household welfare according to well-accepted standards, and for monitoring these trends on a regular basis. In spite of their narrow scope and limitations, these two surveys have provided the country with an invaluable training ground towards the development of a permanent household survey system to support the government's strategic planning in its fight against poverty.
In the process leading to its first Poverty Reduction Strategy Paper (PRSP; also known in Albania as Growth and Poverty Reduction Strategy, GPRS), the Government of Albania reinforced its commitment to strengthening its own capacity to collect and analyze on a regular basis the information it needs to inform policy-making.
In its first phase (2001-2006), this monitoring system will include the following data collection instruments: (i) Population and Housing Census; (ii) Living Standards Measurement Surveys every 3 years, and (iii) annual panel surveys.
The Population and Housing Census (PHC) conducted in April 2001, provided the country with a much needed updated sampling frame which is one of the building blocks for the household survey structure.
The focus during this first phase of the monitoring system is on a periodic LSMS (in 2002 and 2005), followed by panel surveys on a sub-sample of LSMS households (in 2003, 2004 and 2006), drawing heavily on the 2001 census information. The possibility to include a panel component in the second LSMS will be considered at a later stage, based on the experience accumulated with the first panels.
The 2002 LSMS was in the field between April and early July, with some field activities (the community and price questionnaires) extending into August and September. The survey work was undertaken by the Living Standards unit of INSTAT, with the technical assistance of the World Bank. The present document provides detailed information on this survey. Section II summarizes the content of the survey instruments used. Section III focuses on the details of the sample design. Section IV describes the pilot test and fieldwork procedures of the survey, as well as the training received by survey staff. Section V reviews data entry and data cleaning issues. Finally, Section VI contains a series of annotations that all those interested in using the data should read.
Kind of data
Sample survey data [ssd]
Domains: Tirana, other urban, rural; Agro-ecological areas (coastal, central, mountain)
Unit of analysis
Producers and sponsors
Institute of Statistics of Albania
The World Bank
The Republic of Albania is divided geographically into 12 Prefectures (Prefekturat). The latter are divided into Districts (Rrethet) which are, in turn, divided into Cities (Qyteti) and Communes (Komunat). The Communes contain all the rural villages and the very small cities. For the purposes of the April 2001 General Census of Population and Housing, the cities and the villages were divided into Enumeration Areas (EAs). These formed the basis for the LSMS sampling frame.
The EAs in the frame are classified by Prefecture, District, City or Commune. The frame also contains, for every EA, the number of Housing Units (HUs), the number of occupied HUs, the number of unoccupied HUs, and the number of households. Occupied dwellings rather than total number of dwellings were used since many census EAs contain a large number of empty dwellings. The Housing Unit (defined as the space occupied by one household) was taken as the sampling unit, instead of the household, because the HU is more permanent and easier to identify in the field.
A detailed review of the list of census EAs shows that many have zero population. In order to obtain EAs with a minimum of 50 and a maximum of 120 occupied housing units, the EAs with zero population were first removed from the sampling frame. Then, the smallest EAs (with fewer than 50 HUs) were collapsed with geographically adjacent ones and the largest EAs (with more than 120 HUs) were split into two or more EAs. Subsequently, maps identifying the boundaries of every split and collapsed EA were prepared.
Sample Size and Implementation
Since the 2002 LSMS had been conducted about a year after the April 2001 census, a listing operation to update the sample EAs was not conducted. However, given the rapid speed at which new constructions and demolitions of buildings take place in the city of Tirana and its suburbs, a quick count of the 75 sample EAs was carried out followed by a listing operation. The listing sheets prepared during the listing operation became the sampling frame for the final stage of selection.
The final sample design for the 2002 LSMS included 450 Primary Sampling Units (PSUs) and 8 households in each PSU, for a total of 3600 households. Four reserve units were selected in each sample PSU to act as replacement units in non-response cases. In a few cases in which the rate of migration was particularly high and more than four of the originally selected households could not be found for the interview, additional households for the same PSU were randomly selected. During the implementation of the survey there was a problem with the management of the questionnaires for a household that had initially refused, but later accepted, to fill in the food diary. The original household questionnaire was lost in the process and it was not possible to match the diary with a valid household questionnaire. The household therefore had to be dropped from the sample (this happened in Shkoder, PSU 16). The final sample size is therefore 3599 households.
The sampling frame was divided into four regions (strata): Coastal Area, Central Area, Mountain Area, and Tirana (urban and other urban). These four strata were further divided into major cities, other urban, and other rural. The EAs were selected in proportion to the number of housing units in these areas.
In the city of Tirana and its suburbs, implicit stratification was used to improve the efficiency of the sample design. The implicit stratification was performed by ordering the EAs in the sampling frame in a geographic serpentine fashion within each stratum used for the independent selection of EAs.
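As an illustration of how ordering the frame produces implicit stratification, the sketch below draws EAs by systematic selection with probability proportional to size from an already ordered list. It is a minimal sketch only: the documentation does not state the exact selection algorithm used, so the systematic PPS scheme, the function name pps_systematic, and the (ea_id, housing_units) layout are all assumptions.

```python
import random


def pps_systematic(eas, n, rng):
    """Pick n EAs with probability proportional to size from an ordered list.

    eas: list of (ea_id, housing_units) tuples, already sorted in the order
         that provides the implicit stratification (e.g. serpentine).
    The names and the data layout are hypothetical, not taken from the survey.
    """
    total = sum(size for _, size in eas)
    if n <= 0 or total <= 0:
        raise ValueError("need n > 0 and a positive total size")
    interval = total / n                     # sampling interval
    start = rng.random() * interval          # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]

    selected, cumulative, k = [], 0.0, 0
    for ea_id, size in eas:
        cumulative += size
        # An EA is hit once for every target that falls inside its size range;
        # an EA larger than the interval can therefore be hit more than once.
        while k < n and targets[k] < cumulative:
            selected.append(ea_id)
            k += 1
    # Guard against floating-point rounding on the very last target.
    selected.extend(eas[-1][0] for _ in range(n - k))
    return selected
```

Because the selection points are evenly spaced along the ordered list, the selected EAs are spread across the geography implied by the ordering, which is the effect implicit stratification is meant to achieve.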
The sample is not self-weighted. In order to obtain correct estimates the data need to be weighted. A file with household weights is included in the dataset (filename: weights.dta, variable: weight). When using individual rather than household variables an individual weight should be created by multiplying the household weight by the household size.
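The weighting rule above can be sketched as follows. This is a minimal illustration under stated assumptions: only the variable name weight comes from the dataset description, while the Household record, the hhsize and value fields, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Household:
    weight: float   # household weight (variable `weight` in weights.dta)
    hhsize: int     # number of household members (hypothetical field name)
    value: float    # any per-person indicator, e.g. per capita consumption


def individual_weight(hh):
    """Individual weight = household weight x household size."""
    return hh.weight * hh.hhsize


def weighted_mean(households, use_individual_weights=True):
    """Weighted mean of `value`, at the individual or the household level."""
    numerator = denominator = 0.0
    for hh in households:
        w = individual_weight(hh) if use_individual_weights else hh.weight
        numerator += w * hh.value
        denominator += w
    if denominator == 0:
        raise ValueError("sum of weights is zero")
    return numerator / denominator
```

With individual weights, a large household counts for more in population-level estimates than a small household with the same household weight, which is why the two versions of the mean generally differ.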
The survey is representative for Tirana, other urban and rural areas, as well as for Tirana and the three main agro-ecological/economic areas (Coastal, Central and Mountain).
Selection of households
Twelve valid households (HH's) were selected systematically and with equal probability from the Listing Forms in Tirana and 12 housing units (HU's) from census forms in the other areas. Once the 12 HH's were selected, 4 of them were chosen at random and kept as reserve units. During the fieldwork, the enumerator only received the list of the first eight HH's plus a reserve HH. Each time the enumerator needed an additional reserve HH, she had to ask the supervisor and explain the reason why the reserve unit was needed. This process helped determine the reason why reserve units were used and provided more control on their use.
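The selection step described above can be sketched as a systematic, equal-probability draw of 12 units from a listing, followed by a random choice of 4 of them as reserves. This is a sketch under assumptions: the fractional-interval method and the function names systematic_sample and select_households are illustrative and are not taken from the survey's own procedures.

```python
import random


def systematic_sample(units, n, rng):
    """Systematic equal-probability sample of n units from an ordered listing."""
    size = len(units)
    if not 0 < n <= size:
        raise ValueError("n must be between 1 and the listing size")
    step = size / n                       # fractional sampling interval
    start = rng.random() * step           # random start in [0, step)
    return [units[min(int(start + i * step), size - 1)] for i in range(n)]


def select_households(listing, rng, n_total=12, n_reserve=4):
    """Return (main, reserve): 8 units to interview and 4 reserve units."""
    chosen = systematic_sample(listing, n_total, rng)
    # Pick the reserve units at random among the 12 selected units.
    reserve_positions = set(rng.sample(range(n_total), n_reserve))
    main = [u for i, u in enumerate(chosen) if i not in reserve_positions]
    reserve = [u for i, u in enumerate(chosen) if i in reserve_positions]
    return main, reserve
```

For example, select_households(list(range(1, 91)), random.Random(0)) returns 8 main units and 4 reserve units drawn from a hypothetical listing of 90 lines.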
If a HH was not able to have its enumeration completed, the enumerator used the first reserve unit. Full documentation was required of every non-completed interview. If in one PSU more than 4 HH selected were invalid, other units from that PSU were randomly selected by the Central Office as replacement units to keep the enumerator load constant and maintain a uniform sample size in each PSU. This only occurred in a couple of cases.
For the listing of the 75 selected PSUs in Tirana, the census data and the EA maps were used as a base, and then buildings that had been added or demolished since the census were identified. For the PSUs that had buildings added, new lines were added to the listing to reflect each new housing unit before the 8 HHs and 4 reserve units were selected. The HUs identified as invalid (because they had become vacant, non-residential, or had been demolished) were removed from the frame.
In Albania it is possible to find more than one household in one housing unit. In Tirana, where the census forms that formed the basis of the listing for the 2002 LSMS listed the households as opposed to the housing units, each line in the listing sheet was made to correspond with one and only one household, so that every household can have the same probability of being selected as any other household. In the rest of the country, where the census listing of housing units was used, interviewers were instructed to ask at the outset if there was more than one household in the housing unit (with the definition that a household sleeps under the same roof and pools resources for eating (eats out of the same pot)). In the few cases where there was more than one household in the HU, enumerators were instructed to randomly select one household (e.g. by flipping a coin) and interview the household thus chosen. A full description of the original sample design by the sampling consultant, Armando Levinson, can be found in a separate document. Also, details on the actual field implementation are contained in a separate spreadsheet.
Dates of collection
Mode of data collection
Four survey instruments were used to collect information for the 2002 Albania LSMS: a household questionnaire, a diary for recording household food consumption, a community questionnaire, and a price questionnaire. The household questionnaire included all the core LSMS modules as defined in Grosh and Glewwe (2000), plus additional modules on migration, fertility, subjective poverty, agriculture, and nonfarm enterprises. Geographical referencing data on the longitude and latitude of each household were also recorded using portable GPS devices.
Geo-referencing will enable a more efficient spatial link among the different surveys of the system, as well as between the survey households and other geo-referenced information. Given the panel nature of the poverty monitoring system, geo-referencing is also an important tool for facilitating the tracking of households in future surveys.
The choice of the modules was aimed at matching as much as possible the specificity of Albania in terms of data needs, as driven by pressing policy questions. Their design (e.g. questions asked, their sequence, units and time-frames used) was also adapted to fit the Albanian reality. A household member in this survey is defined as a person who has been away from the household for less than six months. Deceased individuals, lodgers, hired workers and servants are never considered household members. Guests who stay with the household for six months or more, infants of less than six months, and new arrivals (such as newlyweds) are considered household members. The household head was counted as a household member if he or she had been away for less than 12 months, rather than the 6-month limit that applies to anyone else who is absent.
The questionnaire was divided in two sections, and was administered to households in two visits, one section per visit. During the second visit the interviewer would also collect additional information of use for the future tracking of the household in the next waves of the panel. This information was collected on a sheet provided for this purpose at the beginning of Section 2 in the main questionnaire.
The Diary for Recording Daily Household Consumption (also known as the booklet) was left in the household by the interviewer during the first visit for the household to compile, and collected during the second visit. Upon collection, interviewers checked the entries (also with the help of a checklist provided at the end of the booklet) and corrected them as appropriate with the help of the most knowledgeable person in the household. The diary consists of:
A. A cover page (for metadata information);
B. Instructions for the household on how to record consumption;
C. Fourteen (i.e. one per day) three-part sections for the recording of (1) food products purchased daily; (2) non-purchased food products consumed by the household (e.g. from own production or payments in kind); (3) food eaten outside the home (e.g. at work, in restaurants);
D. A checklist for use by the interviewer with a list of the 14 main food products consumed in Albania.
A specific column was provided for the interviewers (not the household) to record the ‘reference period’ for bulk purchases of food. Whenever large quantities of a specific item were recorded, the interviewer asked the household, upon collecting the diary, to specify the expected period over which the said quantity would be consumed.
The last section of the diary, the checklist, was compiled by the interviewer, with the help of the most knowledgeable person in the household, upon collection of the diary. Interviewers were instructed to check, for 14 main food staples, whether any consumption of the item had been recorded in the diary. Whenever an item had not been recorded the interviewer would ask the respondent to report whether the item (a) had not been used in the 14 day period, or (b) had been consumed but the household had forgotten to record its consumption, or else (c) had been consumed by the household drawing on stocks purchased or produced outside the 14 day period.
If the inclusion of an item had simply been forgotten the interviewer would then fill the appropriate section of the diary by asking the household to recall the details of that consumption. If the household reported consuming an item purchased before the beginning of the 14 day period, then information on the frequency of purchase, quantity, unit of measure and value of the purchase was recorded in the columns provided to this end in the checklist.
Data users should therefore make sure to use the checklist, as well as column 7 in the table on daily purchases, to supplement the food consumption information included in the main part of the food diary. Extra care should be exercised when using this information because it appears that in practice the use of the checklist has not been consistent across interviewers. Some interviewers have recorded information in the checklist columns 4-7 even in cases where the purchase had taken place in the 14 days but its inclusion had been forgotten. In some other cases information on the same items is found both in the checklist and in the main part of the diary (purchase and own-production).
The Community Questionnaire had two slightly different formats in urban and rural areas. Essentially the same information was collected although a few questions do not appear in the same sequence in both versions of the questionnaire. These are essentially the questions on the population and name of the community, which appear in Section 1A in the urban version and in Section 2 in the rural version. When using data from these questions the analyst should make sure of integrating information that appears in different variables in the dataset. In a few cases in which a question did not make sense in an urban context the question was only asked in rural communities. All such differences are easily identified by comparing the questionnaires, and have also been shaded in the urban version to make it easier to identify them.
In rural areas the community was normally defined as a village and the inhabited area surrounding it. In urban areas the definition was less straightforward, and it was decided on a case by case basis by the core team and the supervisors with the objective of selecting areas that would be understood as communities by the respondents, from time to time adopting boundaries matching those of traditional neighborhoods, or administrative partitions of the urban areas (the bashkia or sector). In Tirana the community was identified with the mini-bashkia level of the administrative partition of the city.
The supervisors were instructed to administer the questionnaire to a group of persons reputed to be best informed about each module within a community (e.g. teachers for the education questions, doctors or hospital managers for health related ones). Whenever possible, the questionnaire was administered in groups and the prevailing response (in case of differing views) was recorded. When this was not possible, respondents were interviewed separately. In a majority of cases, however, the questionnaire was in practice administered to only one respondent, generally an elected or appointed community leader.
The fourth survey instrument used was the price questionnaire. The price questionnaire was sent out with the community questionnaires to the districts. However, while the 'rural version' of the community questionnaire was ready at the same time as the main household questionnaire, the 'urban version' was only finalized a few weeks later. Price data for the rural communities have therefore been collected at the same time as the household data. In urban areas prices were collected in the following weeks and as late as September. Given the low level of inflation this should hardly pose problems of comparability. The dates of the community (and hence price) data collection are included in the dataset.
Data for 96 different items were collected in each community. Prices were generally collected in only one outlet, except for about twenty urban communities for which two or three price observations are available. Thirteen of the latter are urban areas in which the monthly price data collection regularly done by INSTAT for the consumer price index was used.
The coding for the survey made use of ISCO 88 and NACE codes for employment and industry activities respectively, and of COICOP codes for the food item recorded in the 14 day diary.
Besides the checks built into the data entry (DE) program and those performed on the preliminary versions of the dataset as it was building up, an additional round of in-depth checks on the household questionnaire and the food diary was performed in late September and early October in Tirana. Wherever possible data entry errors or inconsistencies in the dataset were spotted, the original questionnaires or diary were retrieved and the information contained therein checked. Changes were made to the August version of the dataset as needed and the dataset was finalized in October.
Data entry for all the survey instruments was performed using custom made applications developed in CS-Pro. Data entry for the household questionnaire was performed in a decentralized fashion in parallel with the enumeration, so as to allow for ‘real-time’ checking of the data collected. This allowed a further tier of quality control checks on the data. Where errors in the data were spotted during data entry, it was possible to instruct enumerators and supervisors to correct the information, if necessary revisiting the household, when the teams were still in the field. A further round of checks was performed by the core team in Tirana and Bank staff in Washington as the data were gathered from the field and the entire dataset started building up.
All but one of the 16 teams in the districts had one data entry operator (DEO); the Fier team had two, and there were four DEOs for Tirana. Each DEO worked with a laptop computer, and was given office space in the regional Statistics Offices, or in INSTAT headquarters for the Tirana teams. The DEOs received Part 1 of the household questionnaire from the supervisor once the supervisor had checked the enumerator’s work, within two days of the enumeration in the field. The DEO then entered the questionnaire in the custom program, noting from the error messages of the program where there were errors or omissions. These errors were then to be detailed on the appropriate page of the questionnaire so that the enumerator could correct them when they returned for the second visit to the household.
Once the DE of 8 questionnaires for a PSU were completed for Part 1, the questionnaires were returned to the supervisor who gave them to the enumerator for administering Part 2 in the field. After Part 2 was completed, and the errors or omissions noted from Part 1, the enumerator turned the questionnaires back to the supervisor, who in turn gave them to the DE operator for entering Part 2. If there were errors found in Part 2, the supervisor was then told and they either solved the problem, or sent the enumerator back to the household.
The data entry of the household questionnaires was completed by mid July 2002 and the data were all delivered to Tirana by the teams. The data entry of the food booklets was done on a separate data-entry program by DEOs in Tirana. To improve accuracy and minimize data entry errors, the data were double-entered. The data entry began on 1 July and was completed on 31 July. By August 8 all the data for the household questionnaire and the food diary had been entered. The questionnaires were all brought to Tirana and stored in INSTAT headquarters. The data entry for the community and price questionnaires took place during the first ten days of October.
In receiving these data it is recognized that the data are supplied for use within your organization, and you agree to the following stipulations as conditions for the use of the data:
1. The data are supplied solely for the use described in this form and will not be made available to other organizations or individuals. Other organizations or individuals may request the data directly.
2. Three copies of all publications, conference papers, or other research reports based entirely or in part upon the requested data will be supplied to:
The World Bank
Development Economics Research Group
LSMS Database Administrator
1818 H Street, NW
Washington, DC 20433, USA
tel: (202) 473-9041
fax: (202) 522-1153
3. The researcher will refer to the 2002 Albania Living Standards Measurement Survey as the source of the information in all publications, conference papers, and manuscripts. At the same time, the World Bank is not responsible for the estimations reported by the analyst(s).
4. Users who download the data may not pass the data to third parties.
5. The database cannot be used for commercial ends, nor can it be sold.
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including acronym and year of implementation)
- the survey reference number
- the source and date of download
Institute of Statistics of Albania. Albania Living Standards Measurement Survey 2002. Ref. ALB_2002_LSMS_v01_M. Dataset downloaded from [website/source] on [date].
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.