The Kyrgyzstan Multipurpose Poverty Survey (KMPS) was designed to be a nationally representative survey capable of measuring the standard of living in the Kyrgyz Republic during the second half of 1993.
While the KMPS is based on the LSMS framework, it has some features which distinguish it from the standard LSMS; in particular it collects extensive nutrition data.
The tradition of survey research in countries of the Former Soviet Union is not particularly strong. In the Kyrgyz Republic, the GOSKOMSTAT family budget surveys were not representative of the population in general, and the poor in particular. These surveys tended to focus on persons who work in enterprises and, to a lesser extent, pensioners. The KMPS represents a significant increase in the data available, and is a more suitable tool for monitoring the social and economic changes occurring in the Kyrgyz Republic.
The 1993 KMPS was carried out under the direction of researchers from the University of North Carolina at Chapel Hill, Paragon Research International, Inc., and the Institute of Sociology of the Russian Academy of Sciences.
The government of the Kyrgyz Republic has established an open access policy with regard to the data collected in the KMPS. The potential uses of this data set are quite broad given the multi-topic nature of the data and the fact that the survey was carried out at the national level.
Kind of data
Sample survey data [ssd]
Unit of analysis
Producers and sponsors
National Statistical Committee (NATSTATCOM)
The World Bank
The sample is designed to be fully representative of all households in the Kyrgyz Republic in the second half of 1993. Stratification was based on information on the population provided in the 1989 Census (since results from the 1994 microcensus were not available at the time of the survey).
According to the 1989 Census, there were about 856,000 families and 4,258,000 individuals living in the Kyrgyz Republic at that time (an average of about five members per family). Though the definition of 'household' used in the KMPS differs from the Census definition of 'family', this figure provided an estimate of the number of households from which the sample was to be drawn. Note that the sampling methodology assumes that any growth in the number of households since 1989 was equally distributed across regions.
A stratified, multi-stage sampling procedure was used, with the number of stages dependent on whether households were being drawn from urban or rural areas. [Note: Formally, the unit of selection was the dwelling, not the household. This is because the survey team only had available a list of dwellings and, in the case of multiple households living within the same dwelling, it was generally not possible to identify the different households prior to drawing the sample. In the cases of multiple households, interviewers were given instructions on how to select one household for interviewing (these instructions are described in sub-section 3.4 below). In a few cases, interviewers had to randomly select one household to interview from the several households residing within the dwelling. However, this was so uncommon that the survey team felt justified in leaving the dwelling out of the stages. Further, when in advance of drawing the sample the survey team was able to identify several households living in a particular dwelling, the households were listed separately before using systematic sampling. Thus, the survey is not unambiguously a sample of dwellings either.]
The formation of strata
The Kyrgyz Republic is divided into 6 oblasts. These oblasts are further divided into 57 raions which fall into two broad categories: 40 county-like territories and 17 relatively large cities or sections of cities which are under the direct jurisdiction of the oblasts rather than the raions in which they are located. A total of 21 strata were formed. These were of two types: self-representing (SR) strata (these consist of raions selected in the sample with certainty), and non-self-representing (NSR) strata.
A total of 14 SR strata were selected. Twelve of these were cities or sections of cities which are so populous that at least some inhabitants would be expected to fall into any random sample of a given size (these are referred to as the 'urban SR strata'). The urban SR strata were:
· the four raions of the capital, Bishkek (which is also the administrative center of Chuiskaya Oblast);
· the five other oblast administrative centers (each consisting of one raion): Dzhelal-Abad; Naryn; Talass; Osh; Balykchi;
· three other major cities (each consisting of one raion): Karakol (formerly Przheval'sk); Tokmak, and Kara-Balta.
The other two SR strata were the two raions Suzakskii and Kara-Suiskii which were selected with certainty for reasons outlined below (these are referred to as the two 'mixed urban-rural SR strata').
Non-self-representing strata
Forty-five raions remained on the list after the selection of the 12 urban SR strata. Forty of these were territorial raions and five were cities under the direct jurisdiction of the oblast in which they were located. The five cities (Uzgen, Tash-Kumyr, Kyzyl-Kiya, Kara-Kul', and Mali-Sai) were combined with the territories in which they are geographically situated, thus increasing the heterogeneity of those raions. The second group of strata, the NSR strata, was therefore drawn from forty raions (some of which were combined with the five cities mentioned above). The NSR strata were identified on the basis of three characteristics, each with three categories: geographical conditions (mountains, valleys, or a mix of the two); type of production (agriculture, industry, or a mix of the two); and ethnic composition (Kyrgyz; mostly Kyrgyz and Uzbek; or mostly Kyrgyz and Russian-speaking). Of the 27 possible strata, six were formed:
I. mountains; agriculture and animal husbandry; predominantly Kyrgyz population.
II. mountains; agriculture, animal husbandry and nurseries; predominantly Kyrgyz population.
III. mountains; agriculture-industry; predominantly Kyrgyz and Uzbek population.
IV. valleys; agriculture; predominantly Kyrgyz and Russian-speaking population.
V. valleys and mountains; agriculture; predominantly Kyrgyz and with Uzbek population.
VI. valleys; agriculture-industry; predominantly Kyrgyz and Russian-speaking population.
Based on the 1989 Census, the household populations of strata II and V were about twice as large as the household populations of the other strata. To ensure that all strata were proportionally represented, strata II and V were therefore each split into two, resulting in a total of eight strata (henceforth named using Arabic numerals so as to distinguish them from the above). The survey team envisioned that stratum 7 would be an NSR stratum. However, as there were only two raions (Suzakskii and Kara-Suiskii) in this stratum, both were chosen with certainty. Stratum 7 therefore technically became two separate SR strata (7a and 7b), each containing a single raion (these are referred to as the mixed urban-rural SR strata). Although these raions technically were SR strata, they were treated in the sampling process as if they were NSR strata (for example, in the way households were selected from them). More details on this are presented below.
The selection of primary sampling units
The nature of the primary sampling units (PSU) differed according to whether they came from SR or NSR strata. In the urban SR strata the PSU were microcensus 'enumeration districts' (ED). Based on the 1989 Census, each microcensus ED was expected to contain about 414 individuals (fewer than 100 households). It was considered appropriate to choose eight to ten households from a given microcensus ED, and therefore enough EDs were selected to yield the desired number of urban households from the particular stratum. The districts were chosen with equal probability and no substitution was permitted.
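As a rough illustration of the arithmetic above, the following sketch estimates how many microcensus EDs would be needed to reach an urban household target when eight to ten households are taken per ED. The function name and the example target of 112 households are illustrative assumptions; the source does not report the number of EDs actually drawn in each stratum.

```python
import math

def ed_count_range(target_households, min_per_ed=8, max_per_ed=10):
    """Return (fewest, most) EDs needed to reach the household target.

    Taking max_per_ed households from each ED needs the fewest EDs;
    taking min_per_ed households from each ED needs the most.
    """
    fewest = math.ceil(target_households / max_per_ed)
    most = math.ceil(target_households / min_per_ed)
    return fewest, most

# Hypothetical example: a stratum with a target of 112 urban households.
print(ed_count_range(112))  # (12, 14)
```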
In the seven NSR strata, the PSU were raions. Two were selected from each stratum with probability proportional to size (PPS), as measured by reported households in the 1989 Census. As mentioned above, strata 7a and 7b were treated in the sampling process as NSR strata. In this sense their PSU were the raions themselves.
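The source states that two raions per NSR stratum were chosen with probability proportional to size, but it does not say which PPS algorithm was used. The sketch below uses systematic PPS selection as one common possibility; this choice, the function name, and the raion names and household counts are all assumptions made for illustration.

```python
import random

def pps_systematic(sizes, n, rng=None):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: dict mapping unit name -> size measure (e.g. Census households).
    A unit larger than the sampling interval can be hit more than once,
    which corresponds to selection with certainty.
    """
    rng = rng or random.Random()
    names = list(sizes)
    total = sum(sizes.values())
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    for k in range(n):
        point = start + k * interval
        cumulative = 0
        chosen = names[-1]  # fallback guards against floating-point edge cases
        for name in names:
            cumulative += sizes[name]
            if point < cumulative:
                chosen = name
                break
        selected.append(chosen)
    return selected

# Hypothetical stratum: raion names and household counts are invented.
stratum = {"raion_A": 21000, "raion_B": 9000, "raion_C": 15000, "raion_D": 5000}
print(pps_systematic(stratum, 2, random.Random(1993)))
```

With only two equally sized raions in a stratum, this procedure always returns both, which mirrors the situation described for stratum 7.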
The selection of secondary sampling units
The selection of secondary sampling units (SSU) differed depending on whether the PSU was drawn from a SR or NSR stratum.
SSU within selected PSU from urban SR strata
The SSU selected from microcensus ED in the 12 urban SR strata were the households (these were also the last stage sampling units for these strata).
SSU within selected PSU from NSR strata
Within the raions selected as PSU from NSR strata (and also the mixed urban-rural SR strata), 'settlements' (or areas where people are living) were classified as gorodskoi (urban) or rural. The number of urban settlements within a raion generally did not exceed two or three.
SSU selected from urban settlements
It should be emphasized that urban settlements were not the SSU; urban settlements were selected, and then the SSU were selected from these settlements. If there was only one urban settlement in a raion, then it was selected. If there was more than one urban settlement, then one was selected using PPS for each 15 urban households required from the raion. There were seven raions selected as PSU from NSR strata that did not contain urban settlements, even though they represented strata in which there were urban settlements (these raions are indicated in Table 6 by the asterisks against the target number of urban households to be sampled). This problem (involving 78 of the 2,100 target households) arose because the number of urban households to be sampled from a particular stratum was calculated taking into account the Census information on household populations in all of the raions within the stratum, not just those that were selected as PSU.
The problem was rectified using substitution. In strata 3, 5, and 6, only one of the two selected raions had urban settlements. Consequently, all of the required urban households were drawn from that particular raion. For example, in stratum 3 the 7 households which were to be drawn from Ak-Suiskii raion were instead drawn from Naukatskii raion. In stratum 1, where neither selected raion contained urban settlements, the 32 target urban households were drawn from urban settlements in Toktogul'skii raion. In stratum 2, where neither selected raion contained urban settlements, the 4 target households were drawn from the city of Talass, which was already in the sample as an SR stratum.
The SSU selected from urban settlements were microcensus ED. The microcensus ED were selected in the same manner as described in sub-section 3.2 above.
SSU selected from rural settlements
In rural areas, villages were the SSUs. Effort was made to ensure that the ethnic composition of villages was properly represented in the sample. Within each selected PSU, data from the 1989 Census was used to group villages by ethnicity. For example, in stratum 1 (in 1989), 94.5 per cent of villages were Kyrgyz; 3 per cent were Russian; 0.8 per cent were Uzbek; 1.7 per cent were other. The number of households chosen from each group of villages was made proportional to the number of villages of each type in the stratum. The selection of individual villages from the ethnic groupings was random and no more than 18 to 20 households were selected from a given village. Note that quotas were not used, so the exact distribution of households by ethnicity was not guaranteed at this level.
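The proportional allocation described above can be sketched as follows, using the stratum 1 village shares quoted in the text. The rural household target of 200 and the largest-remainder rounding rule are assumptions for illustration only; the source does not state how fractional allocations were rounded.

```python
from fractions import Fraction
import math

def allocate_by_share(total_households, shares_pct):
    """Split a household target across groups in proportion to their shares.

    shares_pct maps group name -> percentage given as a string, so that
    the arithmetic is exact. Largest-remainder rounding makes the parts
    sum to total_households.
    """
    exact = {g: total_households * Fraction(p) / 100 for g, p in shares_pct.items()}
    alloc = {g: math.floor(v) for g, v in exact.items()}
    leftover = total_households - sum(alloc.values())
    by_remainder = sorted(exact, key=lambda g: exact[g] - alloc[g], reverse=True)
    for g in by_remainder[:leftover]:
        alloc[g] += 1
    return alloc

# Village shares for stratum 1 as quoted in the text (1989 Census).
stratum1_shares = {"Kyrgyz": "94.5", "Russian": "3", "Uzbek": "0.8", "other": "1.7"}

# Hypothetical target of 200 rural households for the stratum.
print(allocate_by_share(200, stratum1_shares))
# {'Kyrgyz': 189, 'Russian': 6, 'Uzbek': 2, 'other': 3}
```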
The selection of last stage sampling units (households)
For all strata, the last stage sampling unit was the household, with the households being drawn randomly from the selected SSUs. Interviewers were given a list of addresses and the name of the person responsible for each dwelling (akin to a lessee). The interviewer was obliged to interview the household of that person at that address. If the person responsible for the apartment did not reside there, but relatives did live there, the interviewer was obliged to interview the related household at the address. If no related household lived there, the interviewer was obliged to interview whatever household did live there. If more than one household occupied the dwelling, and that fact was not registered before selection, the interviewer was obliged to randomly select one household.
According to the 1989 Census, the 12 SR urban strata contained 34.2 per cent of households, and therefore 718 of the 2,100 households were drawn from them. The number of households drawn from each SR urban stratum was proportional to its total population of households.
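The interviewer instructions above amount to a small decision rule. The sketch below is one possible reading of it, not the survey team's own procedure: the dictionary keys, the function name, and the way ties between several eligible households are broken by a random draw are assumptions.

```python
import random

def choose_household(households, rng=None):
    """Pick the household to interview at a sampled dwelling.

    households: list of dicts with boolean keys
      'has_responsible_person'        - the listed person lives in this household
      'related_to_responsible_person' - household is related to the listed person
    Returns the chosen dict, or None if the dwelling is empty.
    """
    rng = rng or random.Random()
    if not households:
        return None
    # 1. The household of the person responsible for the dwelling.
    primary = [h for h in households if h["has_responsible_person"]]
    if primary:
        return primary[0]
    # 2. Otherwise a related household living at the address.
    related = [h for h in households if h["related_to_responsible_person"]]
    # 3. Otherwise whatever household lives there.
    candidates = related if related else households
    # Several eligible households: select one at random.
    return rng.choice(candidates)

# Hypothetical dwelling with two households.
dwelling = [
    {"id": 1, "has_responsible_person": False, "related_to_responsible_person": True},
    {"id": 2, "has_responsible_person": True, "related_to_responsible_person": False},
]
print(choose_household(dwelling)["id"])  # 2
```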
Distribution of households in self-representing strata:
Dzhelal-Abad (DA): 39
Karakol (IK): 38
Balykchi (IK): 26
Osh (OSH): 112
Naryn (N): 20
Talass (T): 19
Tokmak (CHU): 55
Kara-Balta (CHU): 38
Leninskii raion: 79
Oktibrskii raion: 100
Pervomaiskii raion: 84
Sverdlovskii raion: 108
Total households selected from SR strata: 718
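The figures above can be cross-checked with a few lines of arithmetic. The sketch below simply sums the twelve per-stratum counts listed above and compares the result with 34.2 per cent of the 2,100 households drawn; the variable names are illustrative.

```python
# Household counts for the 12 urban SR strata, in the order listed above.
sr_counts = [39, 38, 26, 112, 20, 19, 55, 38, 79, 100, 84, 108]

total_drawn = 2100        # households drawn in the whole sample
sr_share = 0.342          # share of households in urban SR strata (1989 Census)

listed_total = sum(sr_counts)
expected_total = round(total_drawn * sr_share)

print(listed_total)    # 718
print(expected_total)  # 718
```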
The remaining 1,382 of the target 2,100 households were selected from the NSR strata and also the two mixed urban-rural SR strata. The number of households selected from each of these strata was proportional to the total population of households within the stratum, and is indicated in parentheses beside the stratum title in Table 6. However, the number of households to be surveyed within each stratum was divided equally between the two raions selected as PSUs, even though the population of the raions differed.
Price survey sample procedures
This sub-section provides an outline of the sampling procedures used in the price survey. For more details on the sampling procedure, refer to the document titled 'Instructions for the Price Survey'. The price survey was designed to provide a relatively inexpensive method of measuring the prices to which the households in the sample were exposed. It was designed to provide covariates (contextual variables) for the household survey. The sample was not weighted to represent the price of a food basket in the Kyrgyz Republic. If there was more than one store within a classification of 'potential' stores (see Form C in the Survey), the reporter was required to select the store at which people would be most likely to shop. This was not determined rigorously; however, since the sampling points were normally quite small in area, it was not difficult for the reporter to form an impression. As the sample points in many cases were either quite small or rural, stores of certain classifications often did not exist. If a particular sample point had no store in a key classification (marked by an asterisk in Form C), the reporter was obliged to search for such a store within one kilometer of the borders of the sample point.
Survey as implemented
As mentioned above, the sample was designed to be fully representative and therefore self-weighting.
The target household sample size was 2,000. To allow for an estimated non-response rate of about five per cent, a sample of 2,100 households was drawn. The actual number of completed household interviews was 1,938, reflecting a non-response rate of 7.7 per cent. The response rate for individuals is more difficult to calculate, since some household members (e.g. students under 18 studying elsewhere) could not be interviewed.
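The sample-size figures above can be reproduced with simple arithmetic. This sketch inflates the target by the anticipated five per cent and computes the realized household non-response rate from the counts given in the text; the variable names are illustrative.

```python
target = 2000              # target number of household interviews
expected_inflation = 0.05  # allowance for anticipated non-response
completed = 1938           # completed household interviews

drawn = round(target * (1 + expected_inflation))
non_response_rate = 1 - completed / drawn

print(drawn)                       # 2100
print(f"{non_response_rate:.1%}")  # 7.7%
```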
The sample was designed to be self-weighting and therefore there are no weighting variables.
Dates of collection
Mode of data collection
The KMPS consists of five components: a household questionnaire; an adult questionnaire; a child questionnaire; a food price and availability survey; and a survey on community and social infrastructure.
The household questionnaire was administered to the person who best knew the business and concerns of the family, its income and expenditures, and the health of all its members. This respondent was not necessarily the head of the household; however, the household questionnaire was not to be administered to a child.
Identification data (module A)
This module records the raion and settlement in which the household is situated, a unique household identification number, the date of the interview and its duration, and identifies the interviewer. The unique household identification number, HID, is constructed as HID = AA2 × 1000 + AA3, where AA2 is the settlement identifier and AA3 is a number ranging from 1 to the total number of households sampled from a particular settlement.
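The HID construction described above (the settlement identifier multiplied by 1000, plus the within-settlement household number) can be sketched as follows. The function names are hypothetical, and the check that AA3 stays below 1000 is an assumption added here so that the identifier can be decoded unambiguously; the source does not state an upper bound.

```python
def make_hid(aa2, aa3):
    """Build the household identifier from settlement (AA2) and household (AA3) numbers."""
    if not 1 <= aa3 <= 999:
        # Assumed bound: AA3 must fit in three digits for HID to be unique.
        raise ValueError("AA3 must be between 1 and 999")
    return aa2 * 1000 + aa3

def split_hid(hid):
    """Recover (AA2, AA3) from a household identifier."""
    return divmod(hid, 1000)

# Hypothetical example: household 7 in settlement 12.
hid = make_hid(12, 7)
print(hid)             # 12007
print(split_hid(hid))  # (12, 7)
```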
Household composition (module B)
This module presents a household roster which is designed to collect basic demographic information on members of the household and establishes the relationship between them. A household was defined to include people who reside in the given living quarters, share income and expenditures, and conduct housekeeping together. In determining who was in a particular household, the exact familial relationship between people was irrelevant. Children under 18, unmarried and living elsewhere as students were considered members of the household. Children 18 years and older who were not living with the family were not considered members of the household (even if such people were materially helped by the household). In the individual questionnaires, information is only collected for household members.
Housing (module C)
This module collects information on the type, construction, size and ownership of the housing unit and how long the family has been residing there. It establishes the presence of amenities such as electricity, centralized heating and water supply, sanitation and telephone. Information is collected on other forms of housing owned by the family. The module also establishes the presence and approximate value of consumer durables such as refrigerators, washing machines, televisions, autos or trucks and carpets, and whether or not any of these items were sold in the last twelve months.
Agriculture and animal husbandry (module D)
This module establishes whether the household had the use of land for farming and animal husbandry, and if so, how much land was available and what was the ownership situation. The module considers three aspects of farming and animal husbandry:
· The home production of crops (including vegetables, fruit, grains and tobacco) over the past 12 months is recorded. There are details on the quantity produced, sold, consumed by the family and given free to relatives and others. There is no information on the value of sales of each item, but the overall value of crop production sold in the past 30 days is recorded.
· Information is recorded on the ownership of cattle, pigs, sheep, goats, horses, poultry, rabbits and bee hives. For each type of animal, the module records current numbers owned, changes in numbers over the last 12 months (and reasons for this), sales, and the respondent's estimate of the current market price of the animal.
· Details of the home production of meat, poultry, milk, eggs, honey, wool and pelts over the previous 12 months are provided. There is information on the quantity produced (and what it could currently be sold for), sold, consumed by the family and given free to relatives and others. The overall earnings from sales of animal products in the last 30 days is also recorded.
Expenditures (module E)
This module records household expenditures under four separate reference periods:
· 7 day reference period: Details on the quantity purchased and amount paid for 68 foods are recorded. For important foods (bread, meat, milk, eggs, potatoes and rice) there is a record of the quantity purchased from different sources (state store, cooperative or private store). The amount spent on eating out in the past 7 days is recorded.
· 30 day reference period: This reference period includes information on the amount spent on medicine, fuels, services (e.g. public and private transportation, repair of clothing, furniture and appliances), rental for housing and utilities (cold and hot water, heating and power). Data is also collected on miscellaneous expenses such as tuition fees, other medical treatment (excluding medicine), purchases of financial assets and other financial transactions. The value of either monetary or in-kind gifts to relatives and others is also recorded.
· 3 month reference period: Expenditures on clothing and footwear are recorded.
· 12 month reference period: Expenditures on household appliances, transportation, housing and furnishings are recorded.
Income (module F)
This module records all sources and amounts of income earned by the household over the past 30 days as well as total income earned. Where the income received was in the form of a benefit or in-kind, the household was requested to estimate the monetary value. There is information on the following income sources:
· The total amount of wage income earned by the household and also income from the sale of products from a private land plot or farm were recorded. However, there is more detailed information on these income sources in the adult questionnaire and module D of the household questionnaire respectively, so this section is mainly useful for cross-checking purposes.
· Subsidies from employers and local authorities (e.g. allowances for vacation, nursery school fees, food, public transport, medical treatment, housing) were recorded. The household was asked whether it received any fuel subsidies, but the value of these was not recorded.
· Childcare allowances (one-time childbirth benefits, childcare benefits and single mothers' childcare benefit) were recorded and if a household was eligible for such benefits but did not receive them, then the reason for this was established.
· Gifts or charity from persons outside the household (relatives, friends, religious groups, international organizations, other organizations or private individuals) were recorded.
· Income from other sources (pensions, stipends, sickness pay, unemployment benefits, sales of the products of individual labor activity, sales of private belongings, rental property, invested capital, insurance payments, alimony payments and changes in financial assets) was recorded.
Interviewer remarks (module G)
This section records the interviewer's opinions on the success of the interview and likely accuracy of the data collected.
The adult individual questionnaire was administered personally to every member of the household 14 years and older, preferably privately. Interviewers were not permitted to fill out an adult questionnaire based on answers provided by another member of the household.
Identification data (module H)
This module records the same information provided in module A in the household questionnaire. In addition to basic demographic information, it records the household identification number, HID (=A1H3), and the position of the individual on the household roster, PID (=A1H4).
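Because each adult record carries the household identifier (A1H3) and roster position (A1H4), individual records can be linked back to their household. The sketch below shows one way to do this with plain Python dictionaries; the record layout, the sample values and the function name are hypothetical, and only the variable names A1H3 and A1H4 come from the text.

```python
from collections import defaultdict

def group_by_household(individuals):
    """Group individual records by household ID (A1H3), ordered by roster position (A1H4)."""
    by_hid = defaultdict(list)
    for person in individuals:
        by_hid[person["A1H3"]].append(person)
    for members in by_hid.values():
        members.sort(key=lambda p: p["A1H4"])
    return dict(by_hid)

# Hypothetical individual records (values invented for illustration).
adults = [
    {"A1H3": 12007, "A1H4": 2, "age": 41},
    {"A1H3": 12007, "A1H4": 1, "age": 44},
    {"A1H3": 15003, "A1H4": 1, "age": 67},
]

households = group_by_household(adults)
print(sorted(households))                         # [12007, 15003]
print([p["A1H4"] for p in households[12007]])     # [1, 2]
```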
Migration (module I)
This module records information on:
· the birthplace of the respondent and, if applicable, the place he/she resided before moving to the current area of residence
· residential permits
· language used at home and by parents
· education of parents
Labor (module J)
This module has information on:
· Primary employment: This section records information on the primary job of the respondent if it involves working in an enterprise, organization, collective or state farm, or cooperative. There are details on the respondent's occupation, primary duties, ownership of place of work, and payment in last 30 days (less deductions). There is also information on hours of work in the last 7 days and whether the respondent worked more or less than usual in the last 7 days (and reasons for which). Job satisfaction and willingness to retrain are also recorded.
· Secondary employment: If the respondent held an additional paid job, this section details the type of enterprise and amount paid in the last 30 days (less deductions).
· Entrepreneurial activity: This section records information on businesses owned (or part-owned) by the respondent. For businesses producing goods, it records what is produced and the value of finished goods and expenditures in the past 30 days. For businesses involved in trade operations, it records what was traded, whether goods were bought abroad (and from where), the value of goods sold and bought in the past 30 days and expenditures over the past 30 days. For businesses rendering services, it records the type of service rendered and the value of receipts and expenditures over the past 30 days. For all types of business, there is also information on the percentage of the business owned by the respondent, who else owns the business, the value of business assets, the number of employees (both household members and not) and profit received over the past 30 days.
· Other work: This section records income earned from any other work other than what was mentioned in the sections above.
· Current well-being: The respondent is asked for his/her perception about his/her current economic situation and the prospects for the future.
· Education: This section measures the years of 'general secondary education' as well as completion of specialized vocational, secondary, and higher education.
· Pensioners: Pensioners are asked the type and amount of pension received in the last 30 days.
· Unemployed and inactive: This section is aimed at identifying and recording information on those who are either unemployed or who are not in the labor force. It is possible to establish the duration of unemployment as well as how long a respondent has been out of the labor force (and the reason why the respondent is not in the labor force). Discouraged jobseekers (not actively seeking work but would like to work) can be identified. For those actively seeking work, there is information on how the person has sought work, usage of the government employment service, attitude to retraining and receipt of unemployment benefits.
· Summary questions: Module J also provides a summary question on the total income earned in the past 30 days from all sources and the respondent's main occupation at the present.
Health data in the KMPS
There are five modules which collect information on health issues. The nature of the health data collected, and the way it was collected, are one way in which the KMPS differs from the usual LSMS. The following five sub-sections briefly summarize the data collected in the health related modules of the adult questionnaire.
Morbidity and use of medical facilities (module L)
The information in this module is provided by the respondent. The respondent was asked to describe any medical problems over the past 30 days and whether medical attention was sought. If the respondent saw a doctor in the last 30 days, there is information on the type of medical attention (visit to doctor or home visit), and its cost. If the respondent was hospitalized in the past 30 days, there is information on the cost of treatment, including medicine. There is also information on availability of medicine, usage and cost of preventative care, and number of days missed from work or school because of illness.
Self-reported health evaluation (module M)
The information in this module is provided by the respondent. The respondent was asked for his/her height, weight, and perception of health and state of mind, and ability to work and perform daily activities. The respondent was also asked whether he/she had any difficulties performing a number of activities such as walking, running, lifting, eating and dressing. For those respondents with health problems which affected their ability to perform day to day tasks, information was collected on who provided care and help. Information was collected on the existence and treatment of health problems such as diabetes, myocardial infarction and cerebral hemorrhage, and eyesight and hearing problems. Information on the respondent's usage of tea, coffee, tobacco and alcohol was also collected in this module.
Anthropometric measurements (module Q)
The information in this module is gathered by the interviewer. In this module data was collected on the respondent's height, weight, hip and waist circumference and also whether he/she had any amputated limbs. If the interviewer did not have any medical training, then the data was collected by trained medical personnel.
Questions for women (module N)
The information in this module is provided by the respondent. Female respondents answered questions about their experience with pregnancy, childbirth, abortion, and birth control.
Nutrition (module P)
The information in this module is provided by the respondent. The respondent was asked to reconstruct from memory what food was consumed in either the preceding 24 hours or during the previous day. The interviewer asked questions to help the respondent remember what was eaten; from the answers of the respondent the interviewer assessed the type, quality and quantity of consumed food. To help evaluate the quantity of food consumed, a 'food album' with pictures of various portions of food products and dishes (in actual size) was used. In addition, quantities of foods were recorded in units familiar to the person being questioned (for example, cups, glasses, platefuls and spoonfuls). The food albums could also be shown at the end of the questioning to help the respondent recall food which he or she had perhaps forgotten.
Time use (module O)
This module asked respondents to estimate time spent on different activities (and, if relevant, time spent commuting to them) over the previous seven days (not including the day of the interview). Information was collected on the following activities: working (including work at an enterprise/organization and home, entrepreneurial activity, farming, and individual labor activity); work on the garden at home, dacha or garden plot; studies; shopping for food and non-food items; obtaining household services (laundry, tailor etc.); other home duties including cooking, washing dishes and cleaning; caring for children and other relatives; sleeping and recreational activities.
The child individual questionnaire was completed for every member of the household under the age of 14 years. The questionnaire was administered to the adult member of the household who was responsible for caring for the child. Modules I, O, Q and R collect information similar to that of their counterpart modules in the adult questionnaire. Module H collects the same information as the counterpart module in the adult questionnaire, except there is an additional variable (A1H11) which identifies the adult member of the household who answered the questions on behalf of the child.
Child care (module K)
This module has information about the level of education of the child; if the child currently attends school, the cost of fees and textbooks is recorded. If applicable, there is information on the reason(s) why the child does not currently attend school. There is information on whether the child has missed school during the past year because of agricultural work commitments and, if applicable, how much school was missed. The module also has information on whether the child has been cared for by relatives who are not members of the household and, if so, on how many days in the last week this occurred (and the average number of hours per day). Similar questions are asked regarding those children who attended kindergarten, nursery school, or the like.
Morbidity and use of medical facilities (module L)
This module asks the same questions as its counterpart in the adult questionnaire and collects additional information on vaccinations received by the child, their cost and, if applicable, reasons for not receiving them.
Health evaluation (module M)
This module asks the respondent the child's height and weight and for an assessment of the child's physical and mental health. Data is collected on the presence and treatment of diabetes and the presence of medical conditions such as head cold, sore throat, diarrhoea or other irregularities in defecation and leukemia. There is also information on the child's consumption of tea and coffee.
Nutrition (module P)
This module evaluates the food consumption of the child using the same techniques used in the nutrition module of the adult questionnaire. For those children attending school or nursery school, interviewers were instructed to additionally question the person(s) with knowledge of the child's food intake at that institution (for example, teacher, day care worker or school cafeteria worker). It should be noted that the KMPS does not contain any information on breast-feeding.
SURVEY OF AVAILABILITY AND PRICES OF FOOD PRODUCTS AND FUEL
This survey contains three sections of information relating to retail outlets selling food products in the 'local area' of the households participating in the survey. The local area of the households was determined by the following method:
· housewives from the households participating in the survey were questioned, and from this a preliminary list of retail outlets was constructed.
· from this list of frequented retail outlets, a list of all streets and alleys within walking distance was constructed.
· the observer then walked down these listed streets and alleys and constructed a complete list of all retail outlets.
The survey includes preliminary identification data (from the cover page of the questionnaire) and details the raion, settlement identifier, census enumeration district, date of survey and also the name of the person conducting the survey. The sections contained in the survey are:
· Form A: List of all retail outlets in the neighborhood selling food, drinks and tobacco products. This section has data on up to 20 retail outlets. There is information on location, type, hours of operation, number of employees, type of ownership, and goods sold. There are seven types of sales site: general food stores, and specialized food stores selling milk and milk products; bread; meat, fish and poultry; fruits and vegetables; alcohol; tobacco products. There are three ownership classifications: state owned; nonstate, cooperative, commercial etc.; and privately owned.
· Form B: List of all retail outlets in the neighborhood selling fuel. This section has data on up to 6 retail outlets. There is information on location, type of ownership (same classifications as above) and type of fuel sold (gasoline, coal, wood, diesel fuel, kerosene).
· Form C: Classification of retail outlets in the neighborhood selling food, drinks and tobacco products. This section provides a table in which each trade site listed in Form A is classified by type of ownership and products sold. The purpose of this form is to help the reporter identify which trade sites are to be used in the compilation of data on product availability and prices (see Form E).
· Form D: Classification of retail outlets in the neighborhood selling fuel. This section provides a table in which each trade site listed in Form B is classified by type of ownership and products sold. The purpose of this form is to help the reporter identify which trade sites are to be used in the compilation of data on product availability and prices.
· Form E: Availability and price of food products in different stores in the neighborhood. This section has 14 parts, each covering a different combination of type of sales site and type of ownership. Information is collected on 99 products from general grocery stores; 15 products from milk stores; 15 products from bread stores; 21 products from meat, fish and poultry stores; 26 products from vegetable and fruit stores; 8 products from alcohol stores; and 3 products from tobacco stores. The section contains information on the availability of the different products and the prices of the cheapest and most expensive types or brands of each product. The information is recorded for only one store in each classification. This store was not chosen randomly; reporters were given the freedom to find the most prominent or well-stocked store in a particular classification.
SURVEY OF COMMUNITY AND SOCIAL INFRASTRUCTURE
Basic information on community services, infrastructure and economic structure was collected in this survey. A community, or 'immediate place of residence', is defined as the microcensus enumeration district in urban areas and the settlement (village) in rural areas. The survey includes preliminary identification data and details the raion, settlement identifier, microcensus enumeration district, date of survey and also the name of the person conducting the survey. In addition, there is information enabling the community data to be linked to the household data. The information collected on communities where sampled households live can be grouped as follows:
· Population and area. For urban communities, information was also gathered on the population and area of the entire urban area (or settlement) where the community is located.
· Rights to use of land for personal and commercial purposes.
· Distance to raion and oblast centers and the nearest big city.
· Existing types of housing and types of housing available for purchase by private individuals.
· Transportation and communication infrastructure. Specifically, data were collected on: roads; telegraph, telephone, television and postal services; newspaper service; public libraries; recreational centers; and public transport.
· Presence of social service facilities such as public health facilities, schools and social welfare offices.
· Restaurants and other public eating places.
· Labor markets and Employment Service Offices. Specifically, data were collected on types of occupations available (with monthly salary), existence of Employment Service Office and whether or not any state enterprises have recently been shut down.
· Presence of services such as: banks, police, fire brigade.
· Existence of social infrastructure such as: sources of water, sanitation, electricity.
The local supervisors were required to examine the questionnaires to locate problems which could be remedied in the field. Such problems included missing key demographic information and problems with household and individual identification numbers. All questionnaires were then sent to Bishkek, where they were again checked for identification number problems, and then to Moscow, where yet another ID check was performed.
Open-ended questions (e.g., occupation and nationality questions) were not immediately coded. Instead, the responses were entered into the data set as text, to be coded at a later date. Codes for all open-ended questions except occupation were made available in mid-February. Occupation codes were made available in June 1994.
Data entry and verification of the household questionnaires was completed by a private data entry firm by January 25. All other data entry was handled in-house using the SPSS data program. The first entry of the 10,000 child and adult questionnaires began on December 20, 1993; the verification pass began on January 20 and was completed by February 2. Entry of the community and price surveys began in late January and was completed in two weeks.
Other forms of data appraisal
There was a fairly short time period for survey design, field work and provision of the data.
After double-entry verification, outlier values were flagged by the cleaning program, and the original questionnaires were examined to determine whether there was a mistake. If there was some basis for changing the values (for example, it was obvious that an answer was recorded in grams rather than kilograms), the value was corrected. However, at the request of the World Bank, outliers were left in the data set (unless there was specific evidence of a mistake). Thus, the researcher is responsible for defining outliers and deciding how to treat them (see sub-section 6.4 below for discussion of outliers in the expenditure data).
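Because the definition of an outlier is left to the researcher, the following is only a minimal sketch of one common convention, not a rule prescribed by the KMPS documentation. It assumes Tukey fences (values more than k times the interquartile range beyond the quartiles) and a hypothetical list of expenditure values from which missing value codes have already been removed; both the rule and the threshold k=1.5 are assumptions.

```python
from statistics import quantiles


def flag_outliers(values, k=1.5):
    """Return the positions of values outside Q1 - k*IQR .. Q3 + k*IQR.

    `values` may contain None for entries that are not observed;
    those entries are ignored when the quartiles are computed.
    The Tukey-fence rule and k=1.5 are assumptions, not KMPS rules.
    """
    observed = [v for v in values if v is not None]
    if len(observed) < 2:
        return []  # quartiles are undefined for fewer than two points
    q1, _, q3 = quantiles(observed, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [
        i for i, v in enumerate(values)
        if v is not None and (v < low or v > high)
    ]


# Hypothetical quantities; the last one may be grams entered as kilograms.
sample = [10, 10, 11, 11, 12, 12, 13, 12000]
print(flag_outliers(sample))
```

Flagged positions are candidates for inspection rather than automatic deletion, in keeping with the decision to leave outliers in the data set.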
Missing value codes
There are three missing value codes provided for each variable. For each variable, the number of digits in the missing value code will be equal to the maximum number of digits of a legitimate response for that question. This ensures that for continuous variables, the missing value code is not confused for a legitimate response. The missing value codes employed for each variable are:
· 'don't know' - the highest possible code ending in '7';
· 'refused' - the highest possible code ending in '8';
· 'missing' - the highest possible code ending in '9'.
For example, for all variables occupying fields of two columns, the three missing value codes are 97, 98 and 99 respectively. For all variables occupying fields of five columns, the three missing value codes are 99997, 99998 and 99999 respectively. [Note that there was some difficulty in the data entry of long missing value codes. For example, in an eight-digit variable, a data-entry operator may have omitted one nine in the missing value code and entered the incorrect seven-digit "don't know" code 9999997 instead of the proper eight-digit code, 99999997. Such a mistake could seriously affect data analysis, since a seven-digit code would not be treated as a missing value if it was assumed that the code had eight digits, not seven. Another potential problem with the coding of missing values is the situation where missing value codes of differing lengths were entered for the same variable. An example of this is the variable measuring interview duration (hours) for the adult questionnaire, A1H8_1. There are 78 cases of a '9' being entered for this variable, and yet there are also two cases where a '99' has been entered. Normally, such mistakes were caught in the process of double-entry verification; however, the researcher should be aware of these potential problems.]
It should be noted that the missing value code which ends in '9' was not printed in the questionnaires since it was used only in those situations when a codeable response was absent because of interviewer error. That is, a missing value code which ends in '9' is to be distinguished from a legitimate skip which occurred when, based on a previous response of the respondent, a particular question was not asked of the respondent. For example, when a household states it did not purchase a particular food, the expenditure and quantity questions are skipped. A legitimate skip was coded as a '.'.
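The coding rule above can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, it assumes integer-valued fields read as text or integers, and it treats a code with exactly one nine omitted as a candidate for manual review, following the eight-digit example described above.

```python
def missing_codes(width):
    """Return the three missing value codes for a field `width` digits wide."""
    if width < 1:
        raise ValueError("width must be at least 1")
    top = 10 ** width - 1  # e.g. 99999 for a five-column field
    return {"dont_know": top - 2, "refused": top - 1, "missing": top}


def classify(raw, width):
    """Classify one raw value from an integer field `width` digits wide."""
    text = str(raw).strip()
    if text == ".":
        return "legitimate_skip"
    value = int(text)
    for label, code in missing_codes(width).items():
        if value == code:
            return label
    # A code one digit too short suggests a nine was dropped at data entry.
    if width > 1 and value in missing_codes(width - 1).values():
        return "suspect_truncated_code"
    return "valid"


print(missing_codes(2))
print(classify(9999997, 8))
```

For narrow fields the truncation check will also flag ordinary answers (for example, 7 in a two-column field), so a flag is a prompt to inspect the variable, not grounds for recoding.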
Specific data issues
This sub-section provides specific details relevant to using the data in the KMPS. To date, the majority of the research using the KMPS has focused on the individual and household questionnaires. Consequently, there is no specific information on the adequacy of the data in the survey of availability and prices of food products and fuel or in the survey of community and social infrastructure.
Children's nutrition and health data
Users should be aware of two aspects of the data relevant to assessing the nutrition and health status of children in the Kyrgyz Republic. First, the age of young children is reported only to the nearest year, rather than the nearest month. Hence, it is not possible to use the anthropometric data to compute height or weight for age according to international norms. Second, there is no information on breast-feeding.
Seasonality of the data
As the interviews were conducted just after the major harvest time, the production figures for agriculture and animal husbandry will be higher than at other times during the year. Similarly, the estimates of expenditure on heating will be lower because of seasonal factors.
Non-response of individuals
An individual questionnaire was not completed for every household member, with the overall non-response rate for individuals being 5 percent. The non-response rate varied substantially between regions, with Narunskaya oblast having the highest non-response rate of 14.6 percent and Talasskaya oblast having a non-response rate of only 0.4 percent.
A problem with the adult questionnaire is that there is no question asking for hours worked at an additional job (although question A1J33 records earnings at an additional job). This may not be an important omission as, according to question A1J29, only 2.1 percent of those working in an enterprise, collective farm or state farm (A1J1=1) are working such additional paid jobs. Another aspect of the labor data that should be taken into account by researchers is the fact that question A1J100 asks unemployed or inactive individuals whether they have tried to find work in the last 30 days. Since the standard definition of unemployment (used by, for example, the ILO) includes people looking for work in the last seven days, the use of a 30-day reference period may lead to overestimation of the unemployed and underestimation of the inactive compared with this standard method.
Warning on the use of missing value codes
The following problems with missing value codes in the household questionnaire have been identified:
· for variable AC20, the missing value codes are 999997, 999998, and 999999, but there are valid answers higher than 1,000,000.
· for variable AC41, 95 and 96 are also missing value codes.
· for variable AD141_2E the missing value codes are 99997, 99998, and 99999 even though for all other variables in that section the missing value codes are 999997, 999998, and 999999.
· for variable A1J41, 999996 is also a missing value code.
· for the household with HID=51014 the missing value code for AF14_2B is 999997.02.
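One way to carry these exceptions into an analysis is a small lookup table consulted before the general rule. The sketch below is illustrative only: the table and function names are hypothetical, and the default codes passed in are assumed to come from whatever general rule the researcher applies to the variable.

```python
# Variables whose missing value codes replace the default ones.
SPECIAL_CODES = {
    "AC20": {999997, 999998, 999999},   # valid answers above 1,000,000 exist
    "AD141_2E": {99997, 99998, 99999},  # shorter than the rest of its section
}

# Variables with codes in addition to the default ones.
ADDITIONAL_CODES = {
    "AC41": {95, 96},
    "A1J41": {999996},
}

# Single-record anomalies, keyed by (variable, household identifier).
SINGLE_CASES = {("AF14_2B", 51014): 999997.02}


def is_missing(variable, value, default_codes, hid=None):
    """Return True if `value` should be treated as missing for `variable`."""
    if SINGLE_CASES.get((variable, hid)) == value:
        return True
    codes = SPECIAL_CODES.get(variable, set(default_codes))
    codes = codes | ADDITIONAL_CODES.get(variable, set())
    return value in codes


six_digit = {999997, 999998, 999999}
print(is_missing("AC41", 95, {97, 98, 99}))
print(is_missing("AF14_2B", 999997.02, six_digit, hid=51014))
```

Keeping the exceptions in data rather than in scattered conditionals makes it easy to extend the table if further anomalies are found.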
World Bank LSMS
In receiving these data it is recognized that the data are supplied for use within your organization, and you agree to the following stipulations as conditions for the use of the data:
1. The data are supplied solely for the use described in this form and will not be made available to other organizations or individuals. Other organizations or individuals may request the data directly.
2. Three copies of all publications, conference papers, or other research reports based entirely or in part upon the requested data will be supplied to:
National Statistical Committee of the Kyrgyz Republic
374 Frunze Street Bishkek,
Kyrgyz Republic 720033
The World Bank Development Economics Research Group
LSMS Database Administrator
MSN MC3-306 1818 H Street, NW Washington, DC 20433, USA
tel: (202) 473-9041
fax: (202) 522-1153
3. The researcher will refer to the 1993 Kyrgyz Republic Multipurpose Poverty Survey as the source of the information in all publications, conference papers, and manuscripts. At the same time, the National Statistical Committee of the Kyrgyz Republic is not responsible for the estimations reported by the analyst(s).
4. Users who download the data may not pass the data to third parties.
5. The database cannot be used for commercial ends, nor can it be sold.
Use of the dataset must be acknowledged by including a citation which would include:
- Identification of the Primary Investigator
- Title of the survey (including the year of implementation)
- Survey reference number
- Source and date of download
Kyrgyz Republic National Statistical Committee. Multipurpose Poverty Survey (KMPS) 1993. Ref. KGZ_1993_KMPS_v01_M. Dataset downloaded from www.microdata.worldbank.org on [date]
Disclaimer and copyrights
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.