Survey ID Number
IDN_1993_IFLS_v01_M
Title
Family Life Survey 1993, IFLS1 / SAKERTI 93
Sampling Procedure
The Household Survey Sampling Procedure
The household survey component of the 1993 IFLS was designed to collect contemporaneous and retrospective information on a wide array of family life topics for a representative sample of the Indonesian population. In IFLS1 it was determined to be too costly to interview all household members, so a sampling scheme was used to randomly select several members within a household to provide detailed individual information. IFLS1 conducted detailed interviews with the following household members:
- the household head and his/her spouse
- two randomly selected children of the head and spouse age 0 to 14
- an individual age 50 or older and his/her spouse, randomly selected from remaining members, and
- for a randomly selected 25% of the households, an individual age 15 to 49 and his/her spouse, randomly selected from remaining members.
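The selection rules listed above can be sketched in code. This is a simplified illustration only, not the procedure the field teams actually used: the member fields (`id`, `age`, `relation`, `spouse_id`) and the function name `select_respondents` are hypothetical, and the 25 percent household subsample is passed in as a flag rather than drawn here.

```python
import random

def select_respondents(members, rng, in_25pct_subsample):
    """Return ids of members chosen for detailed interview (simplified sketch)."""
    by_id = {m["id"]: m for m in members}
    selected = []

    def add_with_spouse(person):
        # Add the person and, if listed in the household, his/her spouse.
        for pid in (person["id"], person.get("spouse_id")):
            if pid in by_id and pid not in selected:
                selected.append(pid)

    # 1. The household head and his/her spouse.
    for m in members:
        if m["relation"] == "head":
            add_with_spouse(m)

    # 2. Two randomly selected children of the head/spouse, age 0 to 14.
    children = [m for m in members if m["relation"] == "child" and m["age"] <= 14]
    for child in rng.sample(children, min(2, len(children))):
        selected.append(child["id"])

    # 3. One remaining member age 50 or older, plus spouse.
    older = [m for m in members if m["id"] not in selected and m["age"] >= 50]
    if older:
        add_with_spouse(rng.choice(older))

    # 4. In the 25 percent subsample only: one remaining member age 15 to 49, plus spouse.
    if in_25pct_subsample:
        adults = [m for m in members
                  if m["id"] not in selected and 15 <= m["age"] <= 49]
        if adults:
            add_with_spouse(rng.choice(adults))

    return selected
```

Passing a seeded `random.Random` instance as `rng` makes a draw reproducible.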
Household Selection
The IFLS sampling scheme stratified on provinces, then randomly sampled within provinces. Provinces were selected to maximize representation of the population, capture the cultural and socioeconomic diversity of Indonesia, and be cost effective given the size and terrain of the country. The far eastern provinces of East Nusa Tenggara, East Timor, Maluku and Irian Jaya were readily excluded due to the high costs of preparing for and conducting fieldwork in these more remote provinces. Aceh, Sumatra's most northern province, was excluded out of concern for the area's political violence and the potential risk to interviewers. Finally, due to their relatively higher survey costs, we omitted three provinces on each of the major islands of Sumatra (Riau, Jambi, and Bengkulu), Kalimantan (West, Central, East), and Sulawesi (North, Central, Southeast). The resulting sample consists of 13 of Indonesia's 27 provinces: four on Sumatra (North Sumatra, West Sumatra, South Sumatra, and Lampung), all five of the Javanese provinces (DKI Jakarta, West Java, Central Java, DI Yogyakarta, and East Java), and four provinces covering the remaining major island groups (Bali, West Nusa Tenggara, South Kalimantan, and South Sulawesi). The resulting sample represents 83 percent of the Indonesian population (see Figure 1.1 of the Overview and Field Report in External Documents). Table 2.1 of the same document shows the distribution of Indonesia's population across the 27 provinces, highlighting the 13 provinces included in the IFLS sample.
The IFLS randomly selected enumeration areas (EAs) within each of the 13 provinces. The EAs were chosen from a nationally representative sample frame used in the 1993 SUSENAS, a socioeconomic survey of about 60,000 households. The SUSENAS frame, designed by the Indonesian Central Bureau of Statistics (BPS), is based on the 1990 census. The IFLS was based on the SUSENAS sample because the BPS had recently listed and mapped each of the SUSENAS EAs (saving us time and money) and because supplementary EA-level information from the resulting 1993 SUSENAS sample could be matched to the IFLS-1 sample areas. Table 2.1 summarizes the distribution of the approximately 9,000 SUSENAS EAs included in the 13 provinces covered by the IFLS. The SUSENAS EAs each contain some 200 to 300 households, although only a smaller area of about 60 to 70 households was listed by the BPS for purposes of the annual survey. Using the SUSENAS frame, the IFLS randomly selected 321 enumeration areas in the 13 provinces, over-sampling urban EAs and EAs in smaller provinces to facilitate urban-rural and Javanese-non-Javanese comparisons. A straight proportional sample would likely be dominated by Javanese, who comprise more than 50 percent of the population. A total of 7,730 households were sampled to obtain a final sample size goal of 7,000 completed households. Table 2.1 shows the sampling rates that applied to each province and the resulting distribution of EAs in total, and separately by urban and rural status. Within a selected EA, households were randomly selected by field teams based upon the 1993 SUSENAS listings obtained from regional offices of the BPS. A household was defined as a group of people whose members reside in the same dwelling and share food from the same cooking pot (the standard BPS definition). Twenty households were selected from each urban EA, while thirty households were selected from each rural EA.
This strategy minimizes expensive travel between rural EAs and reduces intra-cluster correlation across urban households, which tend to be more similar to one another than do rural households. Table 2.2 (Overview and Field Report) shows the resulting sample of IFLS households by province, separately by completion status.
Selection of Respondents within Households
For each household selected, a representative member provided household-level demographic and economic information. In addition, several household members were randomly selected and asked to provide detailed individual information.
The Community Survey Sampling Procedure
The goal of the CFS was to collect information about the communities of respondents to the household questionnaire. The information was solicited in two ways. First, the village leader of each community was interviewed about a variety of aspects of village life (the content of this questionnaire is described in the next section). Information from the village leader was supplemented by interviewing the head of the village women's group, who was asked questions regarding the availability of health facilities and schools in the area, as well as more general questions about family health in the community. In addition to the information on community characteristics provided by the two representatives of the village leadership, we visited a sample of schools and health facilities, in which we conducted detailed interviews regarding the institution's activities.
A priori we wanted data on the major sources of outpatient health care, public and private, and on elementary, junior secondary, and senior secondary schools. We defined eight strata of facilities/institutions from which we wanted data. Different types of health providers make up five of the strata, while schools account for the other three. The five strata of health care providers are: government health centers and subcenters (puskesmas, puskesmas pembantu); private doctors and clinics (praktek umum/klinik); the private practices of midwives, nurses, and paramedics (perawats, bidans, paramedis, mantri); traditional practitioners (dukun, sinshe, tabib, orang pintar); and community health posts (posyandu, PPKBD). The three strata of schools are elementary, junior secondary, and senior secondary. Private, public, religious, vocational, and general schools are all eligible as long as they provide schooling at one of the three levels.
Our protocol for selecting specific schools and health facilities for detailed interview reflects our desire that selected facilities represent the facilities available to members of the communities from which household survey respondents were drawn. For that reason we were hesitant to select facilities based solely either on information from the village leader or on proximity to the village center. The option we selected instead was to sample schools and health care providers from lists provided by respondents to the household survey.
For each enumeration area lists of facilities in each of the eight strata were constructed by compiling information provided by the household regarding the names and locations of facilities the household respondent either knew about or used. To generate lists of relevant health and family planning facilities, the CFS drew on two pieces of information from the household survey. The IFLS queried wives of household heads as to whether they, a family member, a friend, or someone else they knew had ever used a particular health facility, such as a health center (section PP of Book I, excerpted in Appendix B). When women responded positively, they were asked to provide the name and location of a facility of that type. When women responded negatively, they were asked if they knew of any facilities of that type, and if so, were asked about the name and location of the facility. These responses provided one source of information regarding health facilities of relevance to community members. Information was collected for four types of facilities/providers: government health centers and subcenters; private clinics; private doctors' practices; the practices of nurses, midwives, and paramedics; and traditional practitioners.
In Indonesia health facilities are also a source of contraceptives. Ever-married women between the ages of 15 and 49 were asked whether they knew about various methods of contraception (Section CX, Book IV, excerpted in Appendix B). When women knew of a method, they were asked to identify the specific facility from which they could obtain that method. For three methods (oral contraceptives, IUDs, and injectables), the name and location of the facility that the woman mentioned was added to the list of health providers if it fell into one of the five strata to be visited by the CFS team. The information from the "knowledge of contraceptive methods" section is the only source of information about the names and locations of community health posts.
The two sources of household information about health facilities are not tied solely to use of those facilities/providers by household members. Though it is possible (and probable) that someone in the household has used the facility that is mentioned, any facility known to the respondent may be mentioned. An alternative procedure would be to base the list on facilities the respondent (or another household member) has actually used in the recent past. We rejected this approach because we felt it would result in a more limited picture of community health care options (since use of health care is sporadic), and possibly be biased by factors such as what illnesses were common around the time of the interview.
The lists of schools were obtained in a slightly different manner. The respondent to the household roster (Section AR, Book I, excerpted in Appendix B) provided the name and location of all schools currently attended by household members under 25 years of age. Consequently, the lists of schools compiled from household information are all schools attended by at least one member of at least one IFLS household. For each enumeration area eight lists of facilities (one per stratum) were constructed based on the combined household responses from that EA. Tables 3.1 and 3.2 (Overview and Field Report) provide the cumulative distributions of the numbers of facilities (by stratum) identified within EAs. For example, the combined number of health centers identified was less than six in 80 percent of the 132 rural EAs in which we interviewed. The combined number of health centers identified was less than six in 68 percent of the 189 urban EAs in which we interviewed. Thus, on average, the combined household responses in urban EAs generate a longer list of health centers than do the combined responses in rural EAs. On average, the lists are longer in urban areas than in rural areas for doctors/clinics and all levels of schools as well. However, on average, the lists are longer in rural areas than in urban areas for nurses/midwives and for traditional practitioners.
Weighting
Household Survey Weighting
The IFLS Household Survey was designed to support a range of analyses based on a smaller, but richly detailed micro-level database covering a wide array of demographic, economic, and health outcomes. The survey was not envisioned as a database to produce national-level or provincial-level estimates of demographic or economic variables. (Other Indonesian surveys such as the SUSENAS are better suited for this purpose.) The public use file does include a series of household and individual analytic weights so that analysts can adjust, when appropriate, for the IFLS household and within-household sampling procedures. The weights are discussed further in The 1993 Indonesian Family Life Survey: Appendix C, Household Codebook (DRU-1195/4-NICHD/AID).
Household weights
The household weights are designed to correct for the over-sampling of urban EAs and EAs in smaller provinces discussed above and summarized in Table 2.1 (Overview and Field Report), as well as the differential sampling rates in urban and rural EAs. When the household weights are applied to the IFLS household sample, the resulting weighted distribution will reflect the 1993 distribution of households by urban and rural status within each of the 13 Indonesian provinces covered by the IFLS. The 1993 distribution of households by province and urban/rural status was generated from 1993 projected population counts provided by BPS and from average household sizes computed from the 1993 SUSENAS. BPS projected population counts were divided by average household sizes to get an estimate of the number of households in 1993 in each province/urban-rural stratum.
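The arithmetic behind the household weights can be illustrated with a short sketch. The first step (projected population divided by average household size) is stated above. The second step (dividing by the number of sampled households in the stratum and rescaling so the weights sum to the sample size) is an assumption made here for illustration, since the exact formula is given only in the Household Codebook. The function names and all input figures are hypothetical.

```python
def estimated_households(projected_population, avg_household_size):
    """Estimated number of households in a province/urban-rural stratum."""
    return projected_population / avg_household_size

def household_weights(strata):
    """strata maps a stratum key to (population, avg_hh_size, n_sampled_households).

    Returns one weight per stratum. Rescaling so that the weights sum to the
    total number of sampled households is an assumption of this sketch.
    """
    raw = {}
    for key, (population, avg_size, n_sampled) in strata.items():
        raw[key] = estimated_households(population, avg_size) / n_sampled
    n_total = sum(n for (_, _, n) in strata.values())
    raw_total = sum(raw[key] * strata[key][2] for key in strata)
    scale = n_total / raw_total
    return {key: w * scale for key, w in raw.items()}
```

With these weights, the weighted share of sampled households in each stratum equals that stratum's share of estimated households.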
Individual weights
The public use file contains three types of individual weights: respondent weights, roster weights, and anthropometry weights.
Respondent weights. The respondent weights are designed to adjust for the within household sampling scheme used to select respondents for detailed interview. From the household roster, the number of household members eligible to be a Book III, IV or V respondent within each household was determined based on the intra-household sampling rules discussed above. Sampling probabilities were then computed for individuals in each of four sampling groups:
1) household heads and their spouse;
2) among remaining members, individuals age 50 or over and their spouse;
3) among remaining members, individuals age 15-49 and their spouse;
4) children of household head/spouse age 0-14 (includes fostered children).
Individuals in the third group were eligible for interview in one out of every four households, so individuals in that group had only a 25 percent probability of selection in addition to their probability of selection within that group. Furthermore, a household could have a maximum of four Book III respondents (see the earlier discussion of the within household sampling rules). Because only 13 households had more than 4 selected respondents, no additional adjustment was made to the weights for these cases. The computed sampling probability for the individual respondent was then inverted to create a respondent weight for that person. Only eligible respondents of Books III, IV or V were given a respondent weight; respondents for those books who were incorrectly chosen by interviewers were given a respondent weight of zero. Examples of such “ineligible” respondents are children age 0-14 who are not biological or adopted children of the household head and spouse but who have a parent in the household, and individuals in the third group who were interviewed even though the household was not in the 25 percent of the sample where such respondents were eligible for interview. The respondent weight (i.e., the inverted sampling probability) was then normalized within each of the sampling groups above. By construction, this normalized weight sums to the number of eligible respondents within the respondent’s sampling group across the 7,224 households where a Book I was completed. Finally, the normalized respondent weight was capped at a value of 3 (99 percent had a weight of 3 or less) to adjust for outliers: individuals with tiny probabilities of selection and thus given very large weights could distort weighted tabulations.
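The sequence of steps described above (invert the sampling probability, assign zero to ineligible respondents, normalize within sampling group, cap at 3) can be sketched as follows. The record fields (`group`, `prob`, `eligible`) and the function name are hypothetical, and the probabilities are taken as given rather than computed from a roster.

```python
def respondent_weights(respondents, cap=3.0):
    """Return one weight per respondent, in input order (illustrative sketch)."""
    # Step 1: invert the sampling probability; ineligible respondents get zero.
    raw = [1.0 / r["prob"] if r["eligible"] else 0.0 for r in respondents]

    # Step 2: per sampling group, total raw weight and count of eligible respondents.
    totals, counts = {}, {}
    for r, w in zip(respondents, raw):
        if r["eligible"]:
            totals[r["group"]] = totals.get(r["group"], 0.0) + w
            counts[r["group"]] = counts.get(r["group"], 0) + 1

    # Step 3: normalize so weights in a group sum to the number of eligible
    # respondents in that group, then cap large values.
    weights = []
    for r, w in zip(respondents, raw):
        if not r["eligible"]:
            weights.append(0.0)
            continue
        g = r["group"]
        weights.append(min(w * counts[g] / totals[g], cap))
    return weights
```

Because the cap is applied after normalization, the capped weights in a group can sum to slightly less than the number of eligible respondents.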
Roster weights. The roster weights are designed so that the weighted age and sex distribution of individuals in the household roster data will reflect the 1993 population age and sex distribution by urban and rural strata within the 13 provinces covered by the survey. Five-year age groupings were used, where individuals age 75 and older were treated as one group. The population distribution was based on data from the 1993 SUSENAS. The roster weight is the ratio of the 1993 SUSENAS population proportion to the household roster proportion for the given province/urban-rural/sex/age group strata into which the individual falls. A roster weight was calculated for all household members listed in the roster (Book I, section AR). If the individual’s age was missing, an age group for the individual was imputed. The imputation involved examining the age of the individual’s spouse and children; if the individual was a Book III, IV or V respondent, dates and ages provided in those sections were used as part of the imputation.
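A minimal sketch of the roster-weight ratio follows. It assumes that both proportions are computed over all cells together (the text does not say whether they are taken within province or overall), and the field names, function names, and population shares used to call it are hypothetical. Missing-age imputation is not shown.

```python
from collections import Counter

def age_group(age):
    """Five-year age group index; ages 75 and older share one group."""
    return min(age // 5, 15)  # 0-4 -> 0, 5-9 -> 1, ..., 70-74 -> 14, 75+ -> 15

def roster_weights(roster, population_share):
    """Weight = population proportion / roster proportion for the person's cell.

    roster: list of dicts with 'province', 'urban', 'sex', 'age'.
    population_share: dict mapping (province, urban, sex, age_group) to the
    share of the population in that cell.
    """
    cells = [(p["province"], p["urban"], p["sex"], age_group(p["age"]))
             for p in roster]
    counts = Counter(cells)
    n = len(roster)
    return [population_share[c] / (counts[c] / n) for c in cells]
```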
Anthropometry weights. The anthropometry weights are designed to account for the intra-household sampling scheme used to select the respondents who were weighed and measured. All respondents of Books III, IV or V and any additional children under age 6 living in the household were eligible for anthropometric measurement. Respondents of Books III, IV and V who were measured were given an anthropometry weight equal to their respondent weight (unnormed and uncapped); other children under age 6 were given the household weight (based on the 7,224 household sample). Household members who were measured but not eligible (i.e., they did not fit the selection criteria) were given an anthropometry weight of zero. The initial anthropometry weight was then normalized to sum to the number of those across all households who were eligible to be measured, to account for the fact that not all household members eligible for anthropometric measurement were actually measured. Finally, as with the respondent weight, the anthropometry weight was capped at 3 to control for those with very small probabilities of selection.
Community and Facility Survey Weighting
The CFS was designed to provide extensive community and facility information to complement the household data. The CFS was not designed to produce nationally representative estimates of community and facility distributions or characteristics. The weights are included so that users can adjust for sampling procedures in their analyses. The CFS database has two basic sets of weights: community weights and facility weights.
Community Weights
The community weights are designed to correct for the over-sampling of urban EAs and EAs in smaller provinces. When weighted, the CFS communities reflect the number of EAs in the province/urban-rural strata in which the community lies. The total number of EAs in a given province and urban-rural strata was computed using 1993 SUSENAS sampling frame data from BPS. The community weight variable is the ratio of the number of actual EAs to the number of sampled EAs.
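As a small illustration of the ratio just described, the sketch below computes a community weight per stratum. The function name, stratum keys, and EA counts are hypothetical, not figures from the SUSENAS frame.

```python
def community_weights(frame_eas, sampled_eas):
    """Ratio of EAs in the frame to EAs sampled, per province/urban-rural stratum."""
    return {stratum: frame_eas[stratum] / sampled_eas[stratum]
            for stratum in sampled_eas}
```

For example, a stratum with 300 EAs in the frame and 15 sampled EAs would give each sampled community a weight of 20.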
Facility Weights
Ideally a facility should receive a weight that is equal to that facility's sampling probability, where the sampling probability is a function of the sampling scheme and the sampling frame.
As discussed in the Sample Design and Response Rates section, the sampling frame for the facility survey is generated by household responses to questions about relevant facilities. This frame is incomplete to the extent that the sample of household respondents fails to identify all facilities of relevance to the population of the EA. The sampling scheme specifies that the probability of being sampled is proportional to market share. The construction of weights based on sampling probabilities is complicated by the fact that we do not know each facility's true market share. Instead, we know the market share that a particular facility captures among the sample of household respondents in the EA. We use a model of market shares to simulate observed market shares, assuming a fixed number of household respondents and multinomial sampling. Comparison of the simulated outcomes to the observed outcomes yields an estimate of the true number of facilities in each EA. The estimated number of facilities in each EA specifies the estimated market share and thus the rank for each facility in the EA.
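The general idea of comparing simulated and observed outcomes can be sketched as below. This is not the model used for the IFLS weights, which is not specified here. The geometric market-share model, the choice to match on the number of distinct facilities mentioned, and all names and parameter values are assumptions made purely for illustration.

```python
import random

def market_shares(k, decay=0.7):
    """Hypothetical share model: the share of rank r is proportional to decay**r."""
    raw = [decay ** r for r in range(k)]
    total = sum(raw)
    return [x / total for x in raw]

def mean_distinct_mentions(k, n_respondents, n_sims, rng):
    """Average number of distinct facilities named when n_respondents each
    name one facility, drawn multinomially from the k market shares."""
    shares = market_shares(k)
    total = 0
    for _ in range(n_sims):
        draws = rng.choices(range(k), weights=shares, k=n_respondents)
        total += len(set(draws))
    return total / n_sims

def estimate_facility_count(observed_distinct, n_respondents,
                            max_k=30, n_sims=500, seed=0):
    """Pick the facility count whose simulated outcome is closest to the observed one."""
    rng = random.Random(seed)
    best_k, best_gap = None, float("inf")
    for k in range(observed_distinct, max_k + 1):
        simulated = mean_distinct_mentions(k, n_respondents, n_sims, rng)
        gap = abs(simulated - observed_distinct)
        if gap < best_gap:
            best_k, best_gap = k, gap
    return best_k
```

Under this sketch, the estimated count and the assumed share model together imply an estimated market share for each rank, which is the quantity the text uses in the next step.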
The next step is to determine the place of each observed facility in the estimated distribution of all facilities and their associated market shares. We do not know the true market share (or even the rank) of an observed facility among all facilities. Instead, we observe a facility's rank (as determined by the number of respondents mentioning that facility) among those facilities identified by our sample of EA residents. This observed rank may or may not be the true rank. For example, the most frequently mentioned facility among sampled EA residents might be only the second or third most frequently mentioned facility if one were to interview all EA residents.
Although the observed rank does not necessarily equal the true rank, it provides information about the true rank. Using the observed rank we make a probabilistic determination of each facility's true rank. We then determine its sampling probability using this model. Our final weight can be summarized as an estimate of the probability that we would sample an observed facility if we conducted another survey using the same sample design.
Data Collection Notes
Approximately 150 field staff were hired to conduct the IFLS Household Survey, while approximately 80 field staff were hired to conduct the Community-Facility Survey. These staff were recruited from the geographic regions in which fieldwork was taking place to ensure fluency in the local languages. Population Center officials, affiliated with universities throughout Indonesia, were instrumental in the recruitment of field staff. Due to the complex nature of the survey, field personnel were required to have completed some college; most, in fact, were young, bright, recent college graduates who were embarking on the first job of their careers.
The IFLS training and field work was conducted in two rounds which overlapped by six weeks; a separate training was held for each round. The first round included the two provinces of Lampung and West Java; training was conducted centrally in a location outside of Jakarta. The second round included the other eleven provinces and training was conducted concurrently in three sites: Jakarta, Padang (West Sumatra), and Malang (East Java). Each round began with approximately three weeks of training to provide in-depth classroom instruction and field practice with the entire questionnaire. Training consisted of the following components: an overview of the study; introduction to survey research; appropriate techniques for asking questions, recording responses, and probing; procedures for identifying sample households, editing one's own work, quality control, and sample control; and a detailed review of all instruments. In addition to lecture presentations, a variety of teaching aids were used in training. For example, the questionnaire was broken into components and discussed question-by-question. Trainers demonstrated the content and flow of portions of the questionnaire, then used round robin and mock interviewing techniques to familiarize trainees with the questionnaire and interviewing approach. Trainees were also given periodic quizzes. A training manual and a manual detailing question-by-question objectives were provided to support classroom training. Following training, field staff were observed closely in the field for one week by LD supervisory staff and RAND personnel.
Supervisors and editors were selected from the top performers in the applicant pool. They received additional intensive training during the week of field practice. This program supplemented interviewer training by covering the following topics: assigning workloads to interviewers, sample control, observation and validation, production reporting to Jakarta, assisting anthropometrists, and handling crises. In addition, a specialized training program was conducted for anthropometrists. These field staff first participated in the general interviewer training so that they were familiar with the study objectives, interviewing techniques, and content of the household questionnaire. They then participated in a separate specialized training program focusing on anthropometric measurement.
CFS training occurred at the same time as household training and covered many of the same topics (study overview, survey research methods, recording responses). Additionally, CFS field staff received training in gaining cooperation from health care providers and health facility and school administrators. Practice interviews were conducted at health facilities and schools near the training site. Supervisors were trained in using the household responses to compile lists of facilities on the SDI and SDII forms and to draw a sample for each facility type.
Field Work:
The IFLS field work began once the training was completed and continued for three to four months depending upon the sample size assigned to each team. The first round of field work, for enumeration areas in Lampung and West Java, was launched during the last week of August 1993 and concluded in mid-November. The second round of field work, covering the other eleven provinces, began in mid-September and continued through January 1994. During the field work, each team was assigned a list of enumeration areas. All households were interviewed in an area before moving on to the next. Field work in each area was conducted in three to five days. Travel time between enumeration areas took one day on average. Before beginning work in a new area, the Supervisor traveled ahead to the next area, obtained permissions, area lists and maps, and set up the base-camp. He also arranged travel for the team from area to area.
The household sampling plan, described earlier, was implemented in the field by the team supervisor. In each enumeration area (Wilcah), a list and map were obtained from the local BPS office by the supervisor. The supervisor reviewed the list with an official at BPS and/or the local village leader for accuracy. Demolished, vacant, non-existent or duplicate structures were removed from the list during this pre-sampling review. In the event that more than 25 percent of the households on the Wilcah list were found to be invalid, the Supervisor notified the LD project director in Jakarta and replaced the Wilcah with the next closest one as assigned by the LD project director.
Using a manual systematic random sampling method, the Supervisor selected the appropriate number of households from the Wilcah list: twenty households for an urban area or thirty households for a rural area. An additional ten reserve households were also selected using the same method. These reserve households were available to replace households that were discovered to be demolished, vacant, non-existent or duplicate after sampling. Once the sample was drawn, the Supervisor assigned the selected households to his interviewers. Selected households were marked with a special sticker to facilitate identification for return interviews and at a later date, should a second round of data collection take place.
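Systematic random sampling from a list takes a random starting point and then selects units at a fixed interval. The sketch below is one common form of that technique and is not a record of the supervisors' manual procedure. Drawing the ten reserves from the households left over after the main draw is an assumption, and the function names are hypothetical.

```python
import random

def systematic_sample(listing, n, rng):
    """Select n units from listing using a random start and a fixed interval."""
    if n > len(listing):
        raise ValueError("cannot sample more units than are listed")
    interval = len(listing) / n
    start = rng.random() * interval
    last = len(listing) - 1
    # Positions start, start + interval, ... are at least one apart,
    # so the truncated indices are all distinct.
    return [listing[min(int(start + i * interval), last)] for i in range(n)]

def draw_ea_sample(listing, urban, rng, n_reserve=10):
    """Main sample (20 urban / 30 rural) plus reserves from the remaining list."""
    n_main = 20 if urban else 30
    main = systematic_sample(listing, n_main, rng)
    chosen = set(main)
    remaining = [h for h in listing if h not in chosen]
    reserve = systematic_sample(remaining, n_reserve, rng)
    return main, reserve
```

A listing of roughly 60 to 70 households, the size described earlier for the BPS listings, is large enough for both draws in either urban or rural areas.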
Generally, interviewers worked in pairs so that a household interview could be completed efficiently and so that they could be available to assist one another as needed. This allowed, for example, one interviewer to be with the household head in one room answering Book II, while the spouse of the head was with another interviewer answering Book IV. Each household interviewer completed approximately 1.25 households per day. This included time to make introductions and appointments, conduct the interview, schedule any return visits, and travel to and from households. In extenuating circumstances, it was necessary to take more than the expected amount of time to complete an area. Such circumstances included encountering areas with larger than average household sizes, greater than normal distance between households, and more difficulty in finding respondents at home. In between interviews, an interviewer conducted a field edit of his/her own work.
Following the field edit, the team editor carefully reviewed the completed questionnaire instruments. Following the edit, the editors were instructed to ask the interviewer to clarify any inconsistencies. In some cases, the editor required the interviewer to return to the household to complete sections of the questionnaire that were left missing. Serious interviewer deficiencies were reported to the Supervisor who also edited at least two households per area. After interviewing had begun in a household, the anthropometrist visited the household to weigh and measure the adult and child respondents and any other children in the household aged 5 or younger who were available. Additional duties of the anthropometrist included conducting verifications of four households per area and assisting the editor by editing Books II and V.
In addition to the work of the field editor, two other critical quality control functions were implemented, namely observation and verification. Using an observation form which covered key techniques of interviewing, the Supervisor typically observed two different interviewer sessions per area and the Korlap or Jaksup observed one to two sessions per area. The observers gave feedback to the interviewers, providing correction and instruction if necessary. This feedback also provided an opportunity for the observers to give positive reinforcement to interviewers for good work. Verification was also performed to confirm that a household was visited, to check household composition and to validate interview data. Using a verification form, the supervisor and anthropometrist verified two and four households per area, respectively.
Each HH team had a companion CF team. The CF team followed behind the HH team, typically with a lag of one or two enumeration areas. Messengers were hired to transfer NCR pages from the household questionnaires (on which facility names and locations were recorded) to the CF team supervisor so that the facility sample could be drawn. The messenger also kept the CF and HH teams informed of each other's whereabouts and progress. CF interviewers edited their own work.
Second Round Interviewer Retraining:
Early in the second round of field work, RAND staff observing the interviewers in the field concluded that household team supervisors and interviewers demonstrated less proficiency in questionnaire administration and field procedures than was observed for the first round field staff. (This may have been due to the fact that there were more trainees in the second round and that training occurred in three locations rather than one.) Field work was halted and a three-week period of retraining was immediately launched. Several RAND project staff traveled to Jakarta to join in-country RAND staff in planning and implementing the retraining program.
The first week of the retraining period was spent assessing training needs and planning the targeted retraining program with LD staff. The retraining program aimed to clarify sampling procedures, review difficult sections of the questionnaire, review quality control measures, and ensure that job descriptions were understood. During the second week, a centralized training program involving the LD field coordinators was conducted in Jakarta. These individuals then served as trainers at the retraining sites during the third week. The field staff were assigned to one of four retraining sites, according to geographic proximity, to participate in the retraining program where they were instructed by the LD field coordinators. RAND staff were also located at each training site for the duration of the retraining to assist as instructors and to ensure the quality and completeness of the program. Field work resumed immediately following the retraining period. Careful observation during the first few weeks by RAND and LD staff demonstrated a significant improvement in the field staff’s proficiency in questionnaire administration and field operations.
Field work had been conducted in 32 enumeration areas (a total of 720 households) prior to the second-round retraining. These households can be identified as described in The 1993 Indonesian Family Life Survey: Appendix C, Household Codebook (DRU-1195/4-NICHD/AID). The quality of the information collected for these households has not been fully assessed; users may want to examine these cases for problems with data quality.