The household registration system known as ho khau has been a part of the fabric of life in Vietnam for over 50 years. The system was used as an instrument of public security, economic planning, and control of migration, at a time when the state played a stronger role in direct management of the economy and the life of its citizens. Although the system has become less rigid over time, concerns persist that ho khau limits the rights and access to public services of those who lack permanent registration in their place of residence. Owing to data constraints, however, previous discussions about the system have relied largely on anecdotal or partial information.
Drawing from historical roots as well as the similar model of China’s hukou, the ho khau system was established in Vietnam in 1964. The 1964 law established the basic parameters of the system: every citizen was to be registered as a resident in one and only one household at the place of permanent residence, and movements could take place only with the permission of authorities. Controlling migration to cities was part of the system’s early motivation, and the system’s ties to rationing, public services, and employment made it an effective check on unsanctioned migration. Transfer of one’s ho khau from one place to another was possible in principle but challenging in practice.
The force of the system has diminished since the launch of Doi Moi as well as a series of reforms starting in 2006. Most critically, it is no longer necessary to obtain permission from the local authorities in the place of departure to register in a new location. Additionally, obtaining temporary registration status in a new location is no longer difficult. However, in recent years the direction of policy changes regarding ho khau has been varied. A 2013 law explicitly recognized the authority of local authorities to set their own policies regarding registration, and some cities have tightened the requirements for obtaining permanent status.
Understanding of the system has been hampered by the fact that those without permanent registration have not appeared in most conventional sources of socioeconomic data. To gather data for this project, a survey of 5000 respondents in five provinces was done in June-July 2015. The samples are representative of the population in 5 provinces – Ho Chi Minh City, Ha Noi, Da Nang, Binh Duong and Dak Nong. Those five provinces/cities are among the provinces with the highest rate of migration as estimated using data from Population Census 2009.
Kind of Data
Sample survey data [ssd]
Unit of Analysis
To gather data for this project, a survey of 5000 respondents in five provinces was done in June-July 2015. The samples are representative of the population in 5 provinces – Ho Chi Minh City, Ha Noi, Da Nang, Binh Duong and Dak Nong. Those five provinces/cities are among the provinces with the highest rate of migration as estimated using data from Population Census 2009.
5 provinces – Ho Chi Minh City, Ha Noi, Da Nang, Binh Duong and Dak Nong.
Producers and sponsors
The World Bank
Sampling for the Household Registration Survey was conducted in two stages: selection of 250 enumeration areas (50 EAs in each of 5 provinces), followed by selection of 20 households in each selected EA, for a total sample size of 5000 households. The EAs were selected using the Probability Proportional to Size (PPS) method based on the squared number of migrants in each EA, so as to raise the selection probability of EAs with more migrants. “Migrants” were defined using the census data as those who lived in a different province five years before the census. The 2009 Population Census data was used as the sampling frame for the selection of EAs. To make sure the sampling frame was accurate and up to date, EA leaders of the sampled EAs were asked to collect information on all households in their ward, regardless of registration status, a month before the actual fieldwork. The information collected included the name, gender, and age of the household head, the address, the household phone number, the residence registration status of the household, and the place of registration 5 years ago. All households on the resulting lists were found to have either temporary or permanent registration in their current place of residence.
Using these lists, selection of survey households was stratified at the EA level to ensure a substantial surveyed population of households without permanent registration. In each EA, 12 households with temporary registration status and 8 households with permanent registration status were selected at random. For EAs where the number of temporary registration households was less than 12, all of the temporary registration households were selected and additional permanent registration households were selected to ensure that each EA had 20 survey households. Sampling weights were calculated taking into account the selection rules for the first and second stages of the survey.
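The within-EA selection rule described above can be sketched as follows. This is a minimal illustration, not the survey team's actual procedure: the function name `select_households`, its parameters, and the household identifiers are hypothetical, and simple random sampling without replacement within each registration group is assumed.

```python
import random


def select_households(temporary, permanent, rng, n_temp=12, n_total=20):
    """Pick survey households in one EA (illustrative sketch).

    temporary, permanent: lists of household IDs from the EA listing.
    rng: a random.Random instance, so that draws are reproducible.
    Takes up to n_temp temporary-registration households, then fills
    the remaining slots with permanent-registration households.
    """
    # Take all temporary households if fewer than n_temp are listed.
    k_temp = min(n_temp, len(temporary))
    chosen_temp = rng.sample(temporary, k_temp)

    # Top up with permanent households so the EA reaches n_total,
    # as far as the listing allows.
    k_perm = min(n_total - k_temp, len(permanent))
    chosen_perm = rng.sample(permanent, k_perm)
    return chosen_temp, chosen_perm


# Hypothetical EA with only 5 temporary-registration households listed.
rng = random.Random(2015)
temp_ids = [f"T{i}" for i in range(5)]
perm_ids = [f"P{i}" for i in range(30)]
temp_sample, perm_sample = select_households(temp_ids, perm_ids, rng)
print(len(temp_sample), len(perm_sample))
```

In this hypothetical EA all 5 temporary households are taken and 15 permanent households make up the remainder of the 20.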
Survey weights were calculated based on the probability of selection. First, the probability of selection of each selected EA was calculated. The formula is as follows:
Where: $P_{ij}$ is the probability of EA $j$ in province $i$ being selected in the sample
$n_{ij}$ is the number of migrant households in EA $j$ in province $i$, according to Population Census 2009
$N_i$ is the total number of migrant households in province $i$, according to Population Census 2009
Second, the probability of a household being selected within an EA (conditional on the EA being part of the sample) was calculated. The formulas are as follows.
Probability of being selected for non-permanent registrant households at EA level:
$P_{jm} = m_j / M_j$
Where: $P_{jm}$ is the probability of non-permanent registrant household $m$ in EA $j$ being selected in the sample
$m_j$ is the number of non-permanent registrant households in EA $j$ selected for the survey
$M_j$ is the total number of non-permanent registrant households in EA $j$ at the time of the survey
Probability of being selected for permanent registrant households at EA level:
$P_{jp} = p_j / P_j$
Where: $P_{jp}$ is the probability of permanent registrant household $p$ in EA $j$ being selected in the sample
$p_j$ is the number of permanent registrant households in EA $j$ selected for the survey
$P_j$ is the total number of permanent registrant households in EA $j$ at the time of the survey
Therefore, weight for non-permanent registrant household is:
And weight for permanent registrant household is:
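A minimal sketch of how the two weights could be computed, assuming the usual design-weight construction in which a household's weight is the inverse of its overall selection probability: $1 / (P_{ij} \cdot P_{jm})$ for a non-permanent registrant household and $1 / (P_{ij} \cdot P_{jp})$ for a permanent registrant household. The function name `household_weight` and all numbers below are hypothetical and are not taken from the survey.

```python
def household_weight(p_ea, n_selected, n_listed):
    """Design weight under the inverse-probability assumption.

    p_ea: first-stage selection probability of the EA (P_ij).
    n_selected: households of this registration type selected in the EA.
    n_listed: households of this registration type listed in the EA.
    """
    if not 0 < p_ea <= 1:
        raise ValueError("p_ea must lie in (0, 1]")
    if not 0 < n_selected <= n_listed:
        raise ValueError("need 0 < n_selected <= n_listed")
    # Second-stage probability, e.g. P_jm = m_j / M_j.
    p_within = n_selected / n_listed
    # Weight = 1 / (first-stage probability * second-stage probability).
    return 1.0 / (p_ea * p_within)


# Hypothetical EA selected with probability 0.05, in which 12 of 60
# temporary and 8 of 200 permanent registrant households were sampled.
w_temporary = household_weight(0.05, n_selected=12, n_listed=60)
w_permanent = household_weight(0.05, n_selected=8, n_listed=200)
print(round(w_temporary, 6), round(w_permanent, 6))
```

Because the two registration groups are sampled at different rates within an EA, their weights differ even though they share the same first-stage probability.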
Dates of Data Collection
Data Collection Mode
Computer Assisted Personal Interview [capi]
Monitoring and Quality Control
We provided the enumerators with phone numbers of MDRI trainers and data analysts so that they could consult us about any situation they could not judge themselves, or inform us of any notable incidents in the field. There were separate hotlines for (i) the technical details of the questionnaire; (ii) use of tablets and software; (iii) fieldwork plan and human resources; (iv) samples; and (v) administrative and financial procedures. Questions directed to the hotlines in all these domains were discussed, answered, and shared among all the enumerators through daily emails.
In all survey sites, MDRI researchers directly supervised the teams in the field. The supervisors attended interviews with the enumerators, evaluated their interviewing skills, and provided feedback immediately after the interviews to the enumerator under supervision as well as to his/her team members. The supervisors played an important role in enhancing the quality of the interviews by advising enumerators and reporting back to the MDRI team so that lessons and experience could be circulated among the enumerators at the end of each day. MDRI analysts conducted direct supervision trips in all 5 provinces for the first 3 days of the survey (29 June – 1 July in Hanoi and Da Nang, and 6 July – 8 July in Dak Nong, Ho Chi Minh City and Binh Duong).
Data Collection Notes
Composition of Survey Teams
Based on the sample size in each province, 6-9 teams of 3 members were dispatched to the field. The trainers strove to compose each team as a good mix of experienced and new enumerators, male and female members, and local enumerators familiar with the geography of the survey sites alongside those less familiar with the provinces.
This team organization proved advantageous. First, while each enumerator in a team worked independently, team members could support each other in various ways, including travelling to sites and exchanging experience at the end of the day. Second, the peer feedback mechanism was especially effective, as the more experienced and agile team members would help coach the others and motivate them to perform better the next day.
Logistics Arrangements and Fieldwork Procedures
Prior to departure, each team was equipped with adequate information and fieldwork documents, including:
• Fieldwork plan (with detailed schedule) – Refer to 9.3 Annex 2 for a sample of fieldwork plan.
• Contact information of local authorities (Provincial and Commune Statistics officers)
• List of households used for interview, including both official households (12 temporary residents and 8 permanent residents) and replacements
• Instruction on working with local authorities
• Printed illustration of quality-of-life assessment scale and categorization of sectors and professions
• Letter of introduction and authorization of survey implementation (from Mekong Development Research Institute to the Provincial Departments of Statistics and from the provincial authorities to district and communes)
• Tablet, paper questionnaire (in case of tablet failure), and training manual with contact information of relevant personnel in case assistance was needed
Immediately after the training, the enumerators were asked to get in touch with the contact persons in each enumeration area as soon as possible to remind them of the fieldwork plan, raise any necessary issues (e.g. re-checking the list of temporary residents if the received lists had too few of them, or changes in the fieldwork plan due to local conditions), and ask them to inform the concerned parties, including the households to be interviewed and local guides/interpreters if necessary, about the fieldwork.
Each enumerator team was required to submit a report after finishing each enumeration area. The report took the form of an e-questionnaire on the tablet and included information on:
• The covered enumeration areas
• The number of successful interviews with temporary residents and permanent residents
• The issues encountered during fieldwork
If for some reason the team could not submit their report on time (Internet failure, change in work plan, distant areas, etc.), they had to report orally to the MDRI team by phone or send an email, and submit their report as soon as they could. The MDRI team matched the number of received questionnaire forms against the reports to identify any missing data and asked the enumerators to recheck these cases.
Mekong Development Research Institute
The questionnaire was mostly adapted from the Vietnam Household Living Standard Survey (VHLSS) and the Urban Poverty Survey (UPS), with a number of questions adjusted or added to follow closely the objectives of this survey. The household questionnaire consists of a set of questions on the following topics:
• Demographic characteristics of household members with emphasis on their residence status in terms of both administrative management (permanent/temporary residence book) and real residential situation.
• Education of household members. Besides education level, respondents are asked whether a household member attends school as “trai-tuyen”, how much the “trai-tuyen” fee/enrolment fee is, and whether they faced difficulty in attending school without permanent residence status.
• Health and health care, collecting information on medical status and health insurance card of household members.
• Labour and employment, asking about household members’ employment status in the last 30 days; their most and second-most time-consuming employment during the last 30 days; and whether they had been asked about residence status when looking for a job.
• Assets and housing conditions. This section collects information on household’s living conditions such as assets, housing types and areas, electricity, water and energy.
• Income and expenditure of households.
• Social inclusion and protection. The respondents are asked whether their household members participate in social organizations, activities, services, and contributions; whether they benefit from any social project/policy; whether they have had any loans within the last 12 months; and to provide information about five of their friends in their residential area.
• Respondents’ knowledge of the Law of Residence and of current regulations on conditions for obtaining permanent residence, their experience dealing with residence issues, and their opinion on the current household registration system.
Managing and Cleaning the Data
Data were managed and cleaned each day immediately upon being received, which occurred at the same time as the fieldwork surveys. At the end of each workday, the survey teams were required to review all of the interviews conducted and transfer collected data to the server. The data received by the main server were downloaded and monitored by MDRI staff.
At this stage, MDRI assigned a technical team to work on the data. First, the team listened to interview records and used an application to detect enumerators’ errors. In this way, MDRI quickly identified and corrected the mistakes of the interviewers. Then the technical team proceeded with data cleaning by questionnaire, based on the following quantity and quality checking criteria.
• Quantity checking criteria: The number of questionnaires must be matched with the completed interviews and the questionnaires assigned to each individual in the field. According to the plan, each survey team conducted 20 household questionnaires in each village. All questionnaires were checked to ensure that they contained all essential information, and duplicated entries were eliminated.
• Quality checking criteria: Our staff performed a thorough examination of the practicality and logic of the data. If there was any suspicious or inconsistent information, the data management team re-listened to the records or contacted the respondents and survey teams for clarification via phone call. Necessary revisions were then made.
Data cleaning was implemented in the following stages:
1. Identification of illogical values;
2. Software-based detection of errors for clarification and revision;
3. Information re-checking with respondents and/or enumerators via phone or through looking at the records;
4. Development and implementation of error-correction algorithms.
The list of detected and adjusted errors is attached in Annex 6.
Outlier detection methods
The data team applied a popular non-parametric method for outlier detection, which can be done with the following procedure:
1. Identify the first quartile Q1 (the 25th percentile data point)
2. Identify the third quartile Q3 (the 75th percentile data point)
3. Identify the inter-quartile range (IQR): IQR = Q3 - Q1
4. Calculate lower limits (L) and upper limits (U) by the following formulas:
5. Detect outliers by the rule: An observation is an outlier if it lies below the lower bound or beyond the upper bound (i.e. less than L or greater than U)
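The procedure above can be sketched as follows. The multiplier `k` applied to the IQR is an assumption: the conventional Tukey value of 1.5 is used here as a default, so that L = Q1 - k × IQR and U = Q3 + k × IQR, and the value actually used by the data team may differ. The function names, the quartile interpolation method, and the sample values are likewise illustrative.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (L, U) = (Q1 - k*IQR, Q3 + k*IQR).

    k = 1.5 is the conventional choice and is an assumption here.
    """
    # quantiles(..., n=4) returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """List the observations lying below L or above U."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical values containing one extreme observation.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
print(iqr_bounds(sample))
print(flag_outliers(sample))
```

A larger `k` (3 is sometimes used for "extreme" outliers) flags fewer observations, so the choice should be checked against each variable's distribution.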
The completed dataset for the “Household registration survey 2015” includes 9 files in STATA format (.dta):
• hrs_maindata: Information on the households, including: assets, housing, income, expenditures, social inclusion and social protection issues, household registration procedures
• hrs_muc1: Basic information on the household members
• hrs_muc2: Education of the household members
• hrs_muc3: Healthcare status of the household members
• hrs_muc4: Employment situation of the household members
• hrs_muc7cc2: Pension and unemployment benefits or severance payment situation of the household members
• hrs_muc7cc4: Regular social allowance/benefit of household members
• hrs_muc9: Households’ financial obligation
• hrs_muc9c19: Social network of households
Data format in Stata has the following structure:
• Each column represents one variable. The variables in the dataset are named by the location of the question within the questionnaire (for example, m1c2 in the data file captures the answer to question 2, section 1 in the questionnaire). There are three types of variables: (i) discrete choice; (ii) continuous and (iii) text (or string) variable. Discrete variables have value labels and variable labels, while continuous and text variables have variable labels only.
• Each row represents one observation. For instance, each row captures information on a household, a household member or a loan.
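As a small illustration of the naming convention, a variable name such as m1c2 can be split into its section and question numbers. This is a sketch that assumes names follow the simple pattern `m<section>c<question>`; the helper `parse_variable_name` is hypothetical, and names that deviate from the pattern are returned as `None` rather than guessed at.

```python
import re

# Assumed pattern: "m" + section number + "c" + question number.
_NAME_PATTERN = re.compile(r"^m(\d+)c(\d+)$")


def parse_variable_name(name):
    """Return (section, question) for names like 'm1c2', else None."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_variable_name("m1c2"))      # section 1, question 2
print(parse_variable_name("hhweight"))  # does not match the pattern
```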
Linh Hoang Vu
Public use files, accessible to all
Use of the dataset must be acknowledged using a citation which would include:
- the Identification of the Primary Investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
The World Bank Group. Vietnam Household Registration Study 2015. Ref. VNM_2015_HRS_v01_M. Dataset downloaded from [URL] on [date]