WLD_2015-2020_FDPs_v01_M
Harmonized Database of Forcibly Displaced Populations and Their Hosts 2015-2020
Name | Country code |
---|---|
Ecuador | ECU |
Peru | PER |
Niger | NER |
Chad | TCD |
Ethiopia | ETH |
Uganda | UGA |
Bangladesh | BGD |
Jordan | JOR |
Iraq | IRQ |
Lebanon | LBN |
Other Household Survey [hh/oth]
Building on recent World Bank efforts to collect representative data on forcibly displaced peoples (FDPs) and their hosts in several countries, the Bank has invested in a harmonization effort of representative surveys covering 10 countries across five regions that hosted displaced people in the period 2015-2020.
Sample survey data [ssd]
Household
Version 1.0
2023-10-04
There have been recent efforts to close data and evidence gaps in a representative way by including displaced populations in national household surveys (for instance, in Chad, Niger, and Uganda) or by generating data on specific populations and displacement events (for example, Syrian refugees in the Mashreq or Rohingya refugees in Cox’s Bazar, Bangladesh). Since 2015/16, some 12 countries have sought to systematically include refugees and other forcibly displaced populations in key surveys.
Building on these country-level efforts, investing in creating comparable data through an ex-post harmonization is an important step to help cross-country comparisons and support analytics that can inform policies at the global level. Recognizing this need, the World Bank Poverty and Equity Team has engaged in a data harmonization effort across 10 countries, designed to support analytics that can highlight how country conditions, including diverse refugee policies and programs, may shape outcomes. The results obtained can orient future policy. The data harmonization effort builds on important seed investments, while recognizing that an adequate evidence base on forced displacement remains an aspirational goal.
The displacement events and contexts considered are diverse in nature. Venezuela is going through one of the deepest economic crises in history. Its Gross Domestic Product per capita halved between 2013 and 2018, and by then 9 out of 10 people lived in poverty. A combination of factors led to the mass exodus of Venezuelans out of their country. Three countries in Latin America host 72 percent of displaced Venezuelans: Colombia (1.4 million), Peru (1 million), and Ecuador (400 thousand). However, Venezuelan migrants represent only between 2 and 3 percent of the local populations in those countries. In 2017, many displaced Rohingya arrived in the Cox’s Bazar district of Bangladesh, fleeing violence in Myanmar. Within a period of four months, some 724,000 newly arrived persons joined other Rohingya who had fled earlier waves of violence. By the end of 2018, nearly 2,000 campsites in Cox’s Bazar hosted around 912,000 Rohingya, more than doubling the population living in the Cox’s Bazar sub-districts of Teknaf and Ukhia.
In Sub-Saharan Africa, the protracted crises worsened in the years 2013 and 2015. With refugee populations of more than one million each, Uganda and Ethiopia are currently the third and sixth largest refugee-hosting nations in the world. In Sub-Saharan Africa, most refugees settle in camps located in areas bordering their country of origin, some of which also suffer from domestic conflict. While some displacement crises in the region date from decades ago, the influx of displaced people between 2014 and 2018 almost doubled the number of asylum seekers in Eastern Africa. By contrast, the number of Syrian households in the three countries of origin covered in this exercise has remained stable since 2013. The Syrian crisis has caused one of the largest episodes of forced displacement since World War II. In effect, more than half of Syria’s prewar population has been forcibly displaced. As of 2016, five years from the start of the conflict, almost 5 million Syrians were registered as refugees in other countries, a number that has increased to 5.4 million by 2023. A handful of Syria’s neighbors, like Turkey, Iraq, Jordan, and Lebanon, continue hosting the bulk of Syrian refugees.
The selection of variables included in the harmonized dataset is oriented toward building the evidence needed to support the pivot from the humanitarian to development response in refugee policy. As with any harmonization effort, there is a substantial tradeoff between broadening the set of variables included and the ability to compare across many settings. In this case, the variables selected for harmonization may be considered a minimum common denominator which would be needed to be able to contrast different displacement contexts. The harmonized variables include key demographics (e.g., age, gender), welfare indicators (e.g., housing and access to basic services), human capital indicators (education), and economic variables (e.g., labor, sources of income, assets). Such indicators are important for the design of policies oriented toward the protection and self-sufficiency of FDPs and to mitigate real and perceived risks to hosts.
This type of harmonization exercise, conducted ex-post, poses substantial challenges because of the diversity of displacement contexts considered and the differing strategies for generating statistics from appropriate surveys. The surveys included in this exercise differ in their objectives at the time they were implemented. For instance, while some were designed to understand the implications of crises as they were ongoing (Syria, Venezuela, Rohingya), others were designed to include displaced populations in national data collection efforts (such as in Sub-Saharan Africa). Just as there is significant heterogeneity within fragile and conflict-affected situations (FCS), so there is also heterogeneity among forcibly displaced populations. We observe substantial variation in legal status and protection; pre-displacement socio-economic characteristics; policy environments and other contextual conditions in the hosting country; and the potential for integration in the host society and/or for return to FDPs’ home country.
While variables such as demographics and labor market participation have been harmonized across numerous datasets globally, standard definitions are lacking for some categories related to forced displacement. For example, the definition of “host” can range from designating only persons who live near a refugee camp to including any national of a country hosting refugees. The notion of forcible displacement is also relative to the specific country context. In working to harmonize the dataset, this complexity calls for particular attention to the way we categorize households and individuals as hosts, refugees, asylum seekers, displaced immigrants, or internally displaced people (IDPs). Finally, certain survey modules, such as those on consumption expenditure, are not harmonized.
The datasets included in the harmonization effort cover key recent displacement contexts: the Venezuelan influx in Latin America’s Andean states; the Syrian crisis in the Mashreq; the Rohingya displacement in Bangladesh; and forcible displacement in Sub-Saharan Africa (Sahel and East Africa). The harmonization exercise encompasses 10 different surveys. These include nationally representative surveys with a separate representative stratum for displaced populations; sub-national representative surveys covering displaced populations and their host communities; and surveys designed specifically to provide insights on displacement contexts. Most of the surveys were collected between 2015 and 2020.
Forcibly displaced populations and their host communities.
Name | Affiliation |
---|---|
Poverty and Equity Global Practice | World Bank |
Obtaining representative information on hosts and displaced populations in a single survey is a complex endeavor. The surveys used in the harmonization exercise combined traditional and nontraditional sampling frames, telephone data, geospatial information, and listing exercises to design representative surveys. All these efforts required introducing innovations to overcome the lack of updated sampling frames for host populations or nonexistent sampling frames for displaced populations. The surveys also made context-specific decisions in terms of how to stratify the sample to cover different groups and areas.
One of the earliest efforts, the Syrian Refugee and Host Community Surveys (SRHCS), was implemented in 2015–2016 in Lebanon, Jordan, and the Kurdistan region of Iraq. In all three settings, the main challenge to implementing a survey that would yield estimates representative of the refugee and host community populations was the lack of an updated or comprehensive sample frame, both for hosting populations and especially for displaced populations. Defining a sampling strategy to yield representative samples of hosts and displaced populations in this context involved two key innovations. The first was the creation of a sample frame feasible for household listing operations from large geographical divisions where none existed. This was the case in Lebanon and in the two largest refugee camps in Jordan. In Lebanon, cartographic divisions of the country were only available for large areas and had to be segmented and sub-segmented based on satellite imagery and dwelling counts to yield geographic areas small enough for listing. These segmentations attempted to divide the larger areas into subdivisions or segments of equal population size, much the same way as enumeration areas are generated. Similarly, for the two largest refugee camps in Jordan, Zaatari and Azraq, satellite imagery was used to divide the camps into mutually exclusive and collectively exhaustive sampling units of roughly equal population size. The second innovation was the use of available information from different sources on displaced population prevalence, which was incorporated into the host population sample frames.
In the case of Cox’s Bazar, Bangladesh, the survey was designed to be representative of three groups: post-2017 displaced Rohingya; hosts in high exposure areas, defined as up to 3 hours walking time from Rohingya campsites; and hosts in low exposure areas in Cox’s Bazar. Two different data collection exercises were carried out to assess the prevalence of displaced Rohingya outside the camps and so inform the sampling strategy for this group. Administrative data from humanitarian agencies were used to design the sampling frame within the camps. One innovation was the use of drone imagery and digital maps to implement the listing within camps and host communities. Two different open-source data sets were used to inform the design of the host strata and help generate the host enumeration areas. Government data such as the 2011 population census and administrative shapefiles were also used.
In Uganda, the survey is representative of the refugee and host community population at the national level. Moreover, it is representative of the refugee and host population in the regions of West Nile and Southwest, and in the city of Kampala. The host population is defined as the native population in districts where refugee settlements are situated. The survey used two different sampling frames. The first, based on the list of Enumeration Areas (EAs) and related information, was used to determine the samples for the host and refugee populations of Kampala, and the host populations in West Nile and Southwest. The second was a newly developed sampling frame for the refugee population in the West Nile and Southwest regions. Primary sampling units were selected in a first stage using a Probability Proportional to Size (PPS) sampling method. Between the first and second stages, household listing operations were carried out in the selected enumeration areas.
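The first-stage PPS selection described above can be illustrated with a minimal sketch. It assumes a systematic PPS procedure with a random start, which is one common way to implement PPS; the survey documentation summarized here does not state which variant was used. The function name `pps_systematic`, the enumeration-area labels, and the size measures are hypothetical.

```python
import random


def pps_systematic(sizes, n, rng=None):
    """Pick n units with probability proportional to size (systematic PPS).

    sizes: dict mapping a unit id to its size measure (e.g. household count).
    A unit whose size exceeds the sampling interval can be selected twice.
    """
    rng = rng or random.Random()
    total = sum(sizes.values())
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    points = iter(start + k * interval for k in range(n))

    selected = []
    cumulative = 0
    point = next(points, None)
    for unit, size in sizes.items():
        cumulative += size
        # Every selection point that falls in this unit's range selects it.
        while point is not None and point < cumulative:
            selected.append(unit)
            point = next(points, None)
    return selected


# Hypothetical enumeration areas and household counts (illustrative only).
ea_sizes = {"EA1": 120, "EA2": 300, "EA3": 80, "EA4": 500}
chosen = pps_systematic(ea_sizes, n=2, rng=random.Random(1))
```

Here the sampling interval is 500, so the area with 500 households is always selected, while smaller areas are selected in proportion to their size.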
In the case of Ecuador, despite a reliable and up-to-date sampling frame from the national census, the lack of information on the numbers of Venezuelans displaced in Ecuador and their locations in the country posed challenges for the design and implementation of the EPEC. This survey used Call Detail Records and External Detail Records between June 2018 and March 2019 provided by the main phone company, Telefónica de Ecuador. Telefónica de Ecuador analyzed its database to determine how many of its active mobile phones in each primary sampling unit (PSU) were likely to belong to displaced Venezuelans, based on the name of the account holder or the volume of calls and messages to/from Venezuela. To estimate the total number of Venezuelans in each PSU, figures were adjusted using Telefónica’s market shares (to estimate the total number of Venezuelan phones from all companies in each PSU) and the fraction of the population using mobile phones. In the first sampling stage, PSUs were stratified into three categories depending on the Venezuelan migrant density. Within each stratum, the sample was selected with probability proportional to the number of households reported by the 2010 Census. In the second sampling stage, all households in each of the selected sectors were listed and stratified into three categories considering nationality and demographic composition. Within each stratum, the sample was selected by systematic equal-probability sampling.
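The adjustment from flagged phones to an estimated Venezuelan population per PSU can be sketched as follows. The exact formula used for the EPEC is not given here, so this assumes the simplest reading of the description: divide the flagged phone count by the operator's market share, then by the share of the population using mobile phones. The function name and all input values are hypothetical.

```python
def estimate_displaced_in_psu(flagged_phones, market_share, mobile_use_share):
    """Rough estimate of displaced persons in one PSU from one operator's data.

    flagged_phones: phones the operator flagged as likely Venezuelan.
    market_share: operator's share of mobile phones in the PSU, in (0, 1].
    mobile_use_share: share of the population using a mobile phone, in (0, 1].
    """
    for name, share in (("market_share", market_share),
                        ("mobile_use_share", mobile_use_share)):
        if not 0 < share <= 1:
            raise ValueError(f"{name} must be in (0, 1], got {share}")

    # Step 1: scale up to phones across all operators in the PSU.
    phones_all_operators = flagged_phones / market_share
    # Step 2: scale up to people, since not everyone uses a mobile phone.
    return phones_all_operators / mobile_use_share


# Illustrative inputs only: 120 flagged phones, 30% market share,
# 80% of the population using mobile phones.
estimate = estimate_displaced_in_psu(120, 0.3, 0.8)
```

With these illustrative inputs the estimate is about 500 people. PSUs could then be ranked by such estimates relative to their total population to form the three density strata.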
While each survey includes sampling weights to aggregate to the host population and displaced population, these weights need to be adjusted when the harmonized data are pooled across countries or regions. This is because whilst a surveyed displaced population group may account for a relatively large share of the displaced persons within a given country, they may correspond to just a small share of the hosting country’s overall population (or vice versa). When pooling data across surveys for comparisons or regression purposes, the sample survey weights should be rescaled.
A possible approach to reweighting the data is to assign each country equal weight, while preserving the share of displaced populations in each country. Similarly, when aggregating across countries in a specific displacement context, each displaced group can be weighted equally to produce summary statistics for the displacement context. For instance, each sample of Syrian refugees in Jordan, Lebanon, and Kurdistan-Iraq can be weighted equally when creating summary statistics for the Syrian refugee context in the Mashreq. An alternative approach could be to rescale the weights of these three sample groups to aggregate up to their contribution to the total of all displaced persons in the Mashreq, total Syrian refugees in the Mashreq, or the total of Syrian refugees in the world. Given that our harmonized dataset covers only 10 countries and by no means provides comprehensive coverage of displaced populations, for simplicity, we suggest using equal weights for producing summary statistics by displacement context.
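The equal-weight approach can be sketched in a few lines. This is a minimal illustration, not the procedure used by the data producers: the field names (`country`, `group`, `weight`, `pooled_weight`) and the sample values are hypothetical and do not correspond to variable names in the harmonized dataset.

```python
from collections import defaultdict


def rescale_equal_country_weights(records, target_total=1.0):
    """Rescale survey weights so every country sums to the same total.

    Dividing by the country total leaves each record's share within its
    country unchanged, so the host/displaced split is preserved.
    """
    country_totals = defaultdict(float)
    for rec in records:
        country_totals[rec["country"]] += rec["weight"]

    rescaled = []
    for rec in records:
        new_rec = dict(rec)  # copy, so the original weights stay untouched
        new_rec["pooled_weight"] = (
            target_total * rec["weight"] / country_totals[rec["country"]]
        )
        rescaled.append(new_rec)
    return rescaled


# Toy records with made-up weights (illustrative only).
sample = [
    {"country": "A", "group": "host", "weight": 900.0},
    {"country": "A", "group": "displaced", "weight": 100.0},
    {"country": "B", "group": "host", "weight": 30.0},
    {"country": "B", "group": "displaced", "weight": 10.0},
]
pooled = rescale_equal_country_weights(sample)
```

After rescaling, countries A and B each contribute a total weight of 1, while displaced persons still account for 10 percent of the weight in A and 25 percent in B. To weight displaced groups equally within a displacement context instead, the same function can be applied to the displaced records only.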
Start | End |
---|---|
2015 | 2020 |
Data collection varied by survey.
Is signing of a confidentiality declaration required? |
---|
yes |
Public
Name | Affiliation | Email |
---|---|---|
Maria Genoni | World Bank | mgenoni@worldbank.org |
Nandini Krishnan | World Bank | nkrishnan@worldbank.org |
DDI_WLD_2015-2020_FDPs_v01_M_WB
Name | Affiliation | Role |
---|---|---|
Development Data Group | The World Bank | Documentation of the DDI |
2023-11-14
Version 01 (November 2023)