Living Standards Measurement Survey 2002 (Wave 2 Panel)
Bosnia and Herzegovina
In 2001, the World Bank in co-operation with the Republika Srpska Institute for Statistics (RSIS), the Federal Office of Statistics (FOS) and the Agency for Statistics of Bosnia and Herzegovina (BHAS), carried out a Living Standards Measurement Survey (LSMS).
The Living Standards Measurement Survey (LSMS), in addition to collecting the information necessary to obtain as comprehensive a measure as possible of the basic dimensions of household living standards, has three basic objectives, as follows:
1. To provide the public sector, government, the business community, scientific institutions, international donor organizations and social organizations with information on different indicators of the population's living conditions, as well as on available resources for satisfying basic needs.
2. To provide information for the evaluation of the results of different forms of government policy and programs developed with the aim to improve the population's living standard. The survey will enable the analysis of the relations between and among different aspects of living standards (housing, consumption, education, health, labor) at a given time, as well as within a household.
3. To provide key contributions to the development of the government's Poverty Reduction Strategy Paper, based on analyzed data.
The Department for International Development, UK (DFID) contributed funding to the LSMS and is also providing funding for a further two years of data collection for a panel survey, to be known as the Household Survey Panel Series (HSPS). Birks Sinclair & Associates Ltd. are responsible for the management of the HSPS with technical advice and support being provided by the Institute for Social and Economic Research (ISER), University of Essex, UK.
The aim of the panel survey is to provide longitudinal data through re-interviewing approximately half the LSMS respondents for two years following the LSMS, in the autumn of 2002 and again in 2003. The LSMS constitutes wave 1 of the panel survey so there will be three years of panel data available for analysis under current funding plans. For the purposes of this document we are using the following convention to describe the different rounds of the panel survey:
Wave 1: LSMS conducted in 2001, which forms the baseline survey for the panel
Wave 2: Second interview of 50% of LSMS respondents in Autumn/Winter 2002
Wave 3: Third interview with sub-sample respondents in Autumn/Winter 2003
The panel data will allow the analysis of key transitions and events over this period, such as labour market or geographical mobility, and of the consequent outcomes for the well-being of individuals and households in the survey.
The panel data will provide information on income and labour market dynamics within FBiH and RS. A key policy area is developing strategies for the reduction of poverty within FBiH and RS. The panel will provide information on the extent to which continuous poverty is experienced by different types of households and individuals over the three-year period. Most importantly, the covariates associated with moves into and out of poverty, and the relative risks of poverty for different people, can be assessed. As such, the panel aims to provide data that will inform the policy debates within FBiH and RS at a time of social reform and rapid change.
Kind of data
Sample survey data [ssd]
Domains: Urban/rural/mixed; Federation, Republic
Unit of analysis
Producers and sponsors
State Agency for Statistics (BHAS)
Republika Srpska Institute of Statistics (RSIS)
Federation of BiH Institute of Statistics (FIS)
The World Bank
UK Department for International Development
The panel survey sample is made up of over 3,000 households drawn from the Living Standards Measurement Survey (LSMS) conducted by the World Bank in co-operation with the statistical institutes (SIs) in 2001. Approximately half the households interviewed on the LSMS were selected and carried forward into the panel survey. These households were re-interviewed in 2002 and will be interviewed for a third time in September 2003.
The 5,400 households interviewed on LSMS formed the sampling frame for the panel survey. The aim was to achieve interviews with approximately half of these (2,700) at wave 2 (1,500 in FBiH and 1,200 in RS). A response rate of 90% was anticipated (as the sample is based on households that have already co-operated with LSMS) and therefore the selected sample consisted of 3,000 households. Unlike the LSMS, the HSPS does not have a replacement element to the sample, only the original 3,000 issued addresses. This approach was new to the Supervisors and Interviewers and special training was given on how to keep non-response to a minimum.
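The issued sample size follows from the target number of interviews and the anticipated response rate. A minimal sketch of that arithmetic, using only the figures quoted above (the function name is illustrative and not part of the survey documentation):

```python
import math

def issued_sample_size(target_interviews: int, response_rate_pct: int) -> int:
    """Households to issue so that the expected number of interviews meets the target."""
    # Working in whole percentages keeps the division exact for round figures.
    return math.ceil(target_interviews * 100 / response_rate_pct)

target_by_entity = {"FBiH": 1500, "RS": 1200}   # wave 2 interview targets
target_total = sum(target_by_entity.values())   # 2,700 households
issued = issued_sample_size(target_total, 90)   # 90% anticipated response rate

print(target_total, issued)
```

Because there is no replacement element, any shortfall against the anticipated response rate translates directly into fewer achieved interviews.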
The LSMS Sample
The LSMS sample design process experienced some difficulties which resulted in a sample with a disproportionately high number of households being selected in urban areas. Work by Peter Lynn from ISER identified the source of this problem by establishing the selection probabilities at each stage of the LSMS sampling process. Essentially, the procedures used for selecting households within municipalities would have been appropriate had municipalities been selected with equal probabilities. But in fact municipalities had been selected with probability proportional to size, and using different overall sampling fractions in each of three strata. The details are documented in a memo by Peter Lynn dated 25-3-2002. Consequently, household selection probabilities varied considerably across municipalities.
Compensating for the LSMS sample imbalance
Having established the selection probability of every LSMS household, it became possible to derive design-based weights that should provide unbiased estimates for LSMS. However, the considerable variability in these weights means that the variance of estimates (and hence standard errors and confidence intervals) is greatly increased. For the HSPS, there was an opportunity to reduce the variability in weights by constructing the subsample in a way that minimised the variability in overall selection probabilities. The overall selection probability for each household would be the product of two probabilities - the probability of being selected for LSMS, and the probability of being selected for HSPS, conditional upon having been selected for LSMS, i.e.
P(HSPS) = P(LSMS) × P(HSPS | LSMS)
Ideally, then, we would have set the values of P(HSPS | LSMS) to be inversely proportional to P(LSMS). This would have resulted in each HSPS household having the same overall selection probability, P(HSPS), so that there would no longer be an increase in the variance of estimates due to variability in selection probabilities. However, this was not possible due to the very considerable variation in P(LSMS) and the limited flexibility provided by a large overall sampling fraction for HSPS (3,000 out of 5,400).
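The constraint can be illustrated with a small sketch. The conditional probability of selection for HSPS given selection for LSMS cannot exceed 1, so households whose LSMS probability is already below the desired overall probability cannot be brought up to it. The probabilities below are hypothetical and chosen only to show the effect; they are not the actual LSMS values.

```python
def conditional_probs(p_lsms, target_overall):
    """Conditional HSPS selection probability needed for a constant overall
    probability, capped at 1 because a household cannot be retained more than once."""
    return [min(1.0, target_overall / p) for p in p_lsms]

# Hypothetical LSMS selection probabilities for three municipalities.
p_lsms = [0.002, 0.004, 0.016]
target = 0.004  # hypothetical desired overall probability

cond = conditional_probs(p_lsms, target)
overall = [p * c for p, c in zip(p_lsms, cond)]

for p, c, o in zip(p_lsms, cond, overall):
    print(f"P(LSMS)={p:.3f}  conditional={c:.2f}  overall={o:.4f}")
```

In this illustration the first municipality is retained in full yet still ends up with a lower overall probability than the others, which is why some variability in weights remains.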
The best that could be done was to minimise the variability in sampling fractions by retaining all the LSMS households in the (mainly rural or mixed urban/rural) municipalities where LSMS household selection probabilities had been lowest and sub-sampling only in the municipalities where LSMS selection probabilities had been much higher. In 16 of the 25 LSMS municipalities, all households were retained for HSPS. In the other 9 municipalities, households were sub-sampled, with sampling fractions ranging from 83% in Travnik to just 25% in Banja Luka and Tuzla.
To select the required number of households within each municipality, every group of enumeration districts (GND) was retained from LSMS. The sub-sampling took place within the GNDs. Households were sub-sampled using systematic random sampling, with a random start and fixed interval. For example, in Novo Sarajevo, where the sampling fraction was 1 in 2, 6 households were selected out of the 12 LSMS households in each GND by selecting alternate households. In Prijedor, where the fraction was 1 in 3, 4 out of 12 were selected by taking every third LSMS household. And so on.
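Systematic random sampling with a random start and fixed interval can be sketched as follows. This is an illustration of the technique under the assumption that the 12 LSMS households in a GND are held in a fixed list order; the household labels and function name are hypothetical.

```python
import random

def systematic_sample(households, interval, rng=random):
    """Select every `interval`-th household after a random start in [0, interval)."""
    start = rng.randrange(interval)
    return households[start::interval]

# Twelve LSMS households in one GND (hypothetical labels).
gnd_households = [f"hh{i:02d}" for i in range(1, 13)]

novo_sarajevo = systematic_sample(gnd_households, 2)  # fraction 1 in 2
prijedor = systematic_sample(gnd_households, 3)       # fraction 1 in 3

print(len(novo_sarajevo), len(prijedor))
```

With 12 households per GND, an interval of 2 always yields 6 households and an interval of 3 always yields 4, whatever the random start.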
The total selected sample for the HSPS consists of 3,007 households (1,681 in FBiH and 1,326 in RS).
The overall design weight for the HSPS sample will be the product of the LSMS weight for the household and this extra design weight (which will of course tend to increase the size of the smallest LSMS weights).
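As a rough sketch of this product, assume the extra design weight is the inverse of the municipality's sub-sampling fraction (the usual convention, though the text does not spell it out). The LSMS weights below are hypothetical.

```python
def hsps_design_weight(lsms_weight: float, subsampling_fraction: float) -> float:
    """Overall design weight: LSMS weight times the inverse sub-sampling fraction."""
    extra_design_weight = 1.0 / subsampling_fraction  # assumption: inverse of fraction
    return lsms_weight * extra_design_weight

# Fully retained municipality: the weight is unchanged.
retained = hsps_design_weight(250.0, 1.0)
# Municipality sub-sampled at 25% (the fraction quoted for Banja Luka and Tuzla):
# a small LSMS weight is multiplied by four.
subsampled = hsps_design_weight(100.0, 0.25)

print(retained, subsampled)
```

Sub-sampling was applied where LSMS selection probabilities were highest, that is, where LSMS weights were smallest, so the extra factor pulls those weights up towards the rest.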
Eligibility for inclusion
The household and household membership definitions are the same standard definitions as used on the LSMS (see Supervisor Instructions, Annex A). The sample membership status and eligibility for interview are as follows:
i) All members of households interviewed at wave 1 (LSMS) have been designated as original sample members (OSMs). OSMs include children within households even if they are too young for interview.
ii) Any new members joining a household containing at least one OSM are eligible for inclusion and are designated as new sample members (NSMs).
iii) At each wave, all OSMs and NSMs are eligible for inclusion, apart from those who move out-of-scope (see discussion below).
iv) All household members aged 15 or over are eligible for interview, including OSMs and NSMs.
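The four rules above can be expressed as a short sketch. The class and function names are hypothetical, and the out-of-scope flag stands in for the rules defined later in this document.

```python
from dataclasses import dataclass

@dataclass
class SampleMember:
    status: str                 # "OSM" (original) or "NSM" (new sample member)
    age: int
    out_of_scope: bool = False  # e.g. moved outside FBiH and RS

def eligible_for_inclusion(member: SampleMember) -> bool:
    """Rule iii: OSMs and NSMs are included unless they have moved out-of-scope."""
    return member.status in ("OSM", "NSM") and not member.out_of_scope

def eligible_for_interview(member: SampleMember) -> bool:
    """Rule iv: included members aged 15 or over are eligible for interview."""
    return eligible_for_inclusion(member) and member.age >= 15

child = SampleMember("OSM", 9)                          # enumerated, too young for interview
adult_joiner = SampleMember("NSM", 34)                  # joined a household with an OSM
emigrant = SampleMember("OSM", 40, out_of_scope=True)   # moved abroad

for m in (child, adult_joiner, emigrant):
    print(m.status, m.age, eligible_for_inclusion(m), eligible_for_interview(m))
```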
Following rules and the definition of 'out-of-scope'
The panel design means that sample members who move from their previous wave address at either wave 2 or 3 must be traced and followed to their new address for interview. The LSMS sample was clustered and over the two waves of the panel some de-clustering will occur as people move. In some cases the whole household will move together but in others an individual member may move away from their previous wave household and form a new split-off household of their own.
All sample members, OSMs and NSMs, are followed at each wave and an interview attempted. This means that a four person household at Wave 1 could generate three additional households at wave 2 if three members, either OSMs or NSMs, move away to form separate households. This method has the benefit of maintaining the maximum number of respondents within the panel and being relatively straightforward to implement in the field.
Definition of 'out-of-scope'
It is important to maintain movers within the sample to maintain sample sizes and reduce attrition and also for substantive research on patterns of geographical mobility and migration. The rules for determining when a respondent is 'out-of-scope' are as follows:
i. Movers out of the country altogether i.e. outside FBiH and RS
This category of mover is clear. Sample members moving to another country outside FBiH and RS will be out-of-scope for that year of the survey and not eligible for interview.
ii. Movers between entities
Respondents moving between entities are followed for interview. The personal details of the respondent are passed between the statistical institutes and a new interviewer assigned in that entity.
iii. Movers into institutions
Although institutional addresses were not included in the original LSMS sample, wave 2 individuals who have subsequently moved into some institutions are followed. The definitions for which institutions are included are found in the Supervisor Instructions.
iv. Movers into the district of Brcko are followed for interview.
The quality of panel data relies heavily on gaining high re-interview rates. High levels of attrition, especially differential attrition between sub-groups in the sample, can lead to bias and reduce the quality of the data.
The response rates for wave 2 are shown in Tables 3 and 4 below. The number of cases that could not be traced is extremely low, as are the whole-household refusal and non-contact rates.
At Wave 2, 3,007 households were issued for interview: 1,681 in the Federation of Bosnia and Herzegovina (FBiH) and 1,326 in the Republika Srpska (RS). As the panel survey design allows for new households to be created as individuals from the original households move away to form their own household, 3,086 households were identified during fieldwork. Of these, 3,050 were potentially eligible for interview; the other 36 households had either moved out of BiH or were deceased.
The response rates at Wave 2 were high. By international standards, the expected response rates at wave 2 of a panel survey would be in the region of 88%, so the BiH panel has performed extremely well compared to other national panels.
In total, 9,708 individuals, including children under 15, were enumerated within the sample households. Within the 3,050 interviewed households, 8,060 individuals aged 15 or over were eligible for interview, with 7,527 (93.4%) being successfully interviewed in total, 209 of whom were new entrants to the survey at Wave 2. The individual response rate within responding households was therefore high.
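The household and individual figures quoted above can be cross-checked with a few lines of arithmetic. This sketch uses only numbers stated in the text; the variable names are illustrative.

```python
# Household level
households_identified = 3086
households_out_of_scope_or_deceased = 36
households_eligible = households_identified - households_out_of_scope_or_deceased

# Individual level (aged 15 or over)
individuals_eligible = 8060
individuals_interviewed = 7527
individual_response_rate = round(100 * individuals_interviewed / individuals_eligible, 1)

print(households_eligible, individual_response_rate)
```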
Following data checking, weights were produced for the wave 2 panel data. A weight has been derived that should be used for all longitudinal analysis of wave 2 data (i.e. analysis that requires data from both waves 1 and 2). It is called b_weight.
b_weight has been calculated as the product of two components, sel_wt and nrwt (i.e. b_weight = sel_wt x nrwt).
sel_wt is a weight to correct for the variation in selection probabilities. This accounts for BOTH the variation between municipalities in selection probabilities for the LSMS AND the variation between municipalities in the sub-sampling fractions for the panel.
nrwt is a weight to correct for differences between subgroups in response rates at wave 2, conditional upon response to wave 1. The subgroups were identified by fitting a segmentation model to predict response/non-response based on a set of 25 potential predictor variables. 28 subgroups (weighting classes) were identified, with individual-level response rates ranging from 56.9% to 100.0%.
The non-response analysis was based upon the 9325 persons who were not new entrants at wave 2 and were not known to be dead at wave 2. Of these, 8558 were respondents at wave 2 (91.8%). Thus, 8558 persons have a non-zero value of b_weight. For non-respondents and wave 2 new entrants, b_weight takes the value 0.
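A minimal sketch of how b_weight is assembled from its two components. It assumes that nrwt is the inverse of the response rate in the person's weighting class, which is the conventional construction but is not stated explicitly in the text; the sel_wt values are hypothetical.

```python
def b_weight(sel_wt: float, class_response_rate: float,
             responded_w2: bool, new_entrant_w2: bool) -> float:
    """Longitudinal weight: sel_wt x nrwt, or 0 for non-respondents and new entrants."""
    if new_entrant_w2 or not responded_w2:
        return 0.0
    nrwt = 1.0 / class_response_rate  # assumption: inverse of class response rate
    return sel_wt * nrwt

# Respondent in the weighting class with the lowest reported response rate (56.9%).
w_low = b_weight(2.0, 0.569, responded_w2=True, new_entrant_w2=False)
# Respondent in a class where everyone responded (100.0%).
w_full = b_weight(2.0, 1.0, responded_w2=True, new_entrant_w2=False)
# Wave 2 new entrant: excluded from longitudinal analysis.
w_new = b_weight(2.0, 1.0, responded_w2=True, new_entrant_w2=True)

# Overall response rate among the 9325 persons in the non-response analysis.
overall_rate = round(100 * 8558 / 9325, 1)
print(round(w_low, 3), w_full, w_new, overall_rate)
```

Under this assumption, respondents from classes with lower response rates carry larger weights, compensating for the similar persons who were lost at wave 2.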
Further weights to be used for wave 2 cross-sectional analysis (i.e. when you want to include the wave 2 new entrants and only require data from wave 2) are being developed. Note: There are only 205 respondent new entrants at wave 2, so basing cross-sectional analysis on the other wave 2 respondents using b_weight should provide good estimates in the meantime.
Dates of collection
Mode of data collection
Data collection supervision
Quality Control checks by the FBSTA
Random checks were made by the FBSTA and interpreter to ensure the interviewers had called at addresses. These checks were made in Prijedor, Samac, Travnik, Novi Grad Sarajevo and Grude. The checks did not reveal any problems regarding calling at addresses. However, in two cases (out of 25) interviewers had reported taking direct interviews where, in fact, proxy information had been given. Supervisors were told to check carefully for this and to re-emphasise the importance of direct interviewing.
The major problem for panel surveys is attrition, that is, the loss of respondents who either refuse to take part any further in the survey, are unable to be contacted during fieldwork, or who move and cannot be traced. Attrition in panel surveys is potentially damaging as the sample size for respondents with complete longitudinal records reduces over time and there is a danger of differential attrition introducing bias. The following procedures were undertaken in an attempt to reduce attrition.
Movers were traced during fieldwork. Interviewers were told during training to try all possible methods to find movers. When households or individuals could not be found by the interviewer or supervisor, a Movers Form was completed and sent to the BHAS. From that point the BHAS, in particular Jelena Miovcic, was responsible for finding the households or individuals. The most effective method for tracing movers was to search for their name in the telephone directory to obtain a phone number and address. If a match was found, the households or individuals were contacted by telephone and date of birth details were checked to confirm that the correct households or individuals had been found. This process was easier in FBiH than in the RS because the phone directory website (www.imenik.telecom.ba), which lists fixed and mobile phone subscribers, was available, as were addresses by canton. Where households or individuals had moved to an unknown canton, it was possible to search all cantons. Elections took place during the same period as the Living in BiH fieldwork, and it had been hoped that the polling committee would hold updated current addresses for all voters. The BHAS therefore made an official request for information about movers, but this request was not successful.
Once the address was found, it was written onto the Control Form and given to the Supervisor to pass to the appropriate (i.e. nearest) Interviewer. Of the 52 forms returned to BHAS, 38% of the new addresses were found and returned to the field, while 56% (29 households) could not be found. The remaining 6% were sent to BHAS but were eventually found by interviewers. By the end of fieldwork only 1.2% of total issued households were finally coded as “untraceable”.
It may have been possible to find further movers if personal ID numbers had been collected. However, in relation to data confidentiality and linkage, it was decided not to collect this information.
Approximately 80% of the questionnaire is based on the LSMS questionnaire, carrying forward core measures that are needed to measure change over time. There are also some additional items that were requested to be included to link with other DFID projects (the Qualitative Studies). The questionnaire was circulated to the Data User Group (DUG) and changes were made as a result of comments received.
Pretest briefings were undertaken on 21 June 2002 in Banja Luka, and on 24 June 2002 in Sarajevo. Three interviewers who had previously worked on the LSMS and all members of the SIG attended each briefing. The pretest sample consisted of 30 LSMS households who were not going to be selected for wave 2 (chosen in a non-random way). To test the questions, interviewers completed a Rating Form and a Debriefing Form capturing structured questions on how respondents reacted to the survey (did it seem too long, were they worried about confidentiality etc). The debriefings were held on 1 and 2 July 2002 in Banja Luka and Sarajevo respectively.
In relation to the questionnaire, the pretest went very smoothly, with very few recommended changes to the questionnaire and no refusals from respondents. However, procedures regarding movers were not tested, as none of the sample members had moved during the year.
The pretest identified an average interview length of 43 minutes (34 cases). There is evidence that over-burdening respondents with very long questionnaires on a panel survey can lead to higher levels of non-response and attrition. The aim was to have an average interview length of 45 minutes. Following the pretest some questions were removed and a few added to keep the overall length about the same.
Issues arising from the pretest
Falsifying information: The pre-test found one rogue interviewer had falsified some LSMS interviews. This has not been found with any other interviewers during the panel fieldwork so it is not problematic. This has been verified through the quality control back-checks implemented for the panel.
Proxy information: In several cases all members of the household were interviewed at the same time, with much of the data taken by proxy rather than through direct interviewing. Therefore, it was emphasised during the Supervisor and Interviewer training that direct interviews must be achieved. A payment scheme to reward interviewers who took direct interviews with all household members was introduced for the main survey.
Consumption module: Prior to the pretest the World Bank made a case for module 11 from the LSMS (consumption) to be included in the questionnaire. Module 11 was given to the pretest interviewers, with time boxes, to test. Interviewers did not react to it very well and 2 out of 25 households refused to complete it. On average it took 34 minutes (22 cases) to complete. Based on its time consuming nature it was decided that it should not be included because of concerns about over-burdening respondents in the vital second wave. This module, possibly shortened, will be reconsidered for inclusion at wave 3.
CSPro was the chosen data entry software. This was the software used for the LSMS, and considerable skill in programming it had been acquired by some SIG members. The CSPro program has two main features to reduce the number of keying errors and the editing required following data entry:
- Data entry screens that included all skip patterns.
- Range checks for each question (allowing three exceptions for inappropriate, don't know and missing codes).
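The range-check idea can be illustrated outside CSPro with a short Python sketch. This is not the CSPro logic used on the survey; the three exception codes and the age range shown are hypothetical placeholders for the inappropriate, don't know and missing codes mentioned above.

```python
# Hypothetical exception codes: inappropriate, don't know, missing.
EXCEPTION_CODES = frozenset({-1, -8, -9})

def passes_range_check(value: int, low: int, high: int,
                       exceptions=EXCEPTION_CODES) -> bool:
    """Accept a keyed value if it is in the valid range or is an exception code."""
    return value in exceptions or low <= value <= high

# Example: an age question with a hypothetical valid range of 0 to 110.
keyed_values = [34, -8, 140, -9, 0]
rejected = [v for v in keyed_values if not passes_range_check(v, 0, 110)]
print(rejected)
```

Rejecting out-of-range values at the keying stage means the operator can correct them against the paper questionnaire immediately, rather than leaving them for later editing.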
Unlike the LSMS, where data entry was carried out simultaneously in the field, interviewers delivered their completed questionnaires to the Field Office in Banja Luka or Sarajevo for data entry. Ten computer staff were engaged in each Field Office to enter all questionnaires and Control Forms. Two one-day training events were held on October 3rd and 4th in the Chamber of Commerce in Banja Luka. Training was conducted by Fahrudin Memic, Donald Prohaska, Dario Lozancic and Vladan Sibinovic. A short introduction to the survey was delivered by the FBSTA.
Actual questionnaires returned from the field were entered by the DE operators during training. In this way it was possible to fine-tune the program and identify any problems with data entry personnel.
Data entry was completed by December 2002. A mission from 8-13 December was undertaken by Heather Laurie and Fran Williams (ISER) to identify what level of cleaning was required. A further mission undertaken by Fran Williams from 16-22 March 2003 examined what data cleaning had been carried out and what was yet to be completed. Fran Williams has completed substantial cleaning work and a clean version of the data was ready by June 2003.
Individual level identifiers have been attached to all members of the wave 1 LSMS households selected for the panel sample. There is a household level identifier (IDD) for the issued household and each member of that household has a person number (ID) within the household. The household level identifier is needed for each wave but does not necessarily need to be related to the previous wave identifier for a given household. Households change in composition over time, making the notion of a core household that endures over time problematic for a panel.
In addition to these wave specific household and person number identifiers, each sample member has a unique personal identifier (LID) attached to them. This identifier is the unique number that each sample member carries with them throughout the life of the panel, even if they move between different households. This is the key linking identifier to be used in analysis when matching together data for the same individual from different waves of the survey and is a critical variable.
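A minimal sketch of matching individuals across waves on LID. The records and values are invented for illustration, and only the identifier names (IDD, ID, LID) come from the documentation.

```python
# Hypothetical person-level records; only the identifier names come from the text.
wave1 = [
    {"LID": 1001, "IDD": 10, "ID": 1},
    {"LID": 1002, "IDD": 10, "ID": 2},
]
wave2 = [
    {"LID": 1001, "IDD": 10, "ID": 1},  # same person, same household
    {"LID": 1002, "IDD": 57, "ID": 1},  # same person, now in a split-off household
    {"LID": 2001, "IDD": 57, "ID": 2},  # new entrant, no wave 1 record
]

def link_waves(earlier, later):
    """Pair each later-wave record with the same person's earlier record via LID."""
    earlier_by_lid = {rec["LID"]: rec for rec in earlier}
    return {
        rec["LID"]: (earlier_by_lid[rec["LID"]], rec)
        for rec in later
        if rec["LID"] in earlier_by_lid
    }

linked = link_waves(wave1, wave2)
print(sorted(linked))
```

Matching on IDD and ID instead would fail for the second person, whose household and person numbers both changed between waves, which is why LID is the linking key.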
In receiving these data it is recognized that the data are supplied for use within your organization, and you agree to the following stipulations as conditions for the use of the data:
1. The data are supplied solely for the use described in this form and will not be made available to other organizations or individuals. Other organizations or individuals may request the data directly.
2. Three copies of all publications, conference papers, or other research reports based entirely or in part upon the requested data will be supplied to:
Department for International Development
Sarajevo, Bosnia and Herzegovina
The World Bank
Development Economics Research Group
LSMS Database Administrator
1818 H Street, NW
Washington, DC 20433, USA
tel: (202) 473-9041
fax: (202) 522-1153
3. The researcher will refer to the 2002 Living in Bosnia and Herzegovina Survey as the source of the information in all publications, conference papers, and manuscripts and will credit DFID, the Agency for Statistics of Bosnia and Herzegovina, the Federal Office of Statistics and the Republika Srpska Institute of Statistics as the organizations that collected the data. At the same time, the statistical institutions of Bosnia and Herzegovina are not responsible for the estimations reported by the analyst(s).
4. Users who download the data may not pass the data to third parties.
5. The database cannot be used for commercial ends, nor can it be sold.
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.