BRA_2025_ECoE-SOGIESC_v01_M
The Economic Cost of Exclusion Based on Sexual Orientation, Gender Identity and Expression, and Sex Characteristics in the Labor Market in Brazil 2025
ECoE-SOGIESC 2025
| Name | Country code |
|---|---|
| Brazil | BRA |
Labor Force Survey [hh/lfs]
This dataset was collected as part of the Economic Cost of Exclusion (ECoE) research program, a World Bank analytical initiative that examines the economic and fiscal consequences of exclusion based on sexual orientation, gender identity and expression, and sex characteristics (SOGIESC). The ECoE methodology has previously been applied in other countries, including Serbia and North Macedonia, to estimate the labor-market, productivity, and fiscal impacts associated with the exclusion of LGBTI+ populations.
The Brazil study adapts and expands the ECoE framework to the Brazilian context through a mixed-methods design combining a large-scale survey of LGBTI+ adults with qualitative research. In addition to estimating economic and fiscal costs, the study seeks to address critical data gaps related to labor-market outcomes, discrimination, and socioeconomic inclusion of LGBTI+ populations in Brazil.
Sample survey data [ssd]
Individual LGBTI+ adult respondents (18 years and older)
Version 1.0 (May 2026). Final anonymized public-use dataset. This version contains cleaned and validated survey data collected between June and September 2025 for the Economic Cost of Exclusion (ECoE) – Brazil study. Personal identifiers have been removed or anonymized to protect respondent confidentiality. The dataset includes derived analytical variables and survey weights used in the study's economic and fiscal cost analyses.
2026-05-29
This is the first public-use version of the ECoE – Brazil dataset. The dataset has been cleaned, validated, anonymized, and documented for dissemination through the World Bank Microdata Library. Direct identifiers and other potentially disclosive information have been removed or recoded to protect respondent confidentiality. The dataset includes survey weights, derived analytical variables, and accompanying documentation used in the Economic Cost of Exclusion in Brazil study.
The survey collected information on sexual orientation, gender identity and expression, sex characteristics (SOGIESC), demographic and socioeconomic characteristics, educational attainment, labor-market participation, employment status, unemployment, labor-force inactivity, occupation, income, informality, experiences of workplace discrimination and exclusion, and perceptions of inclusion in the workplace.
The study also gathered information on experiences of stigma, harassment, barriers to employment, career progression, identity concealment, and other forms of exclusion affecting labor market outcomes. Data were collected from LGBTI+ adults aged 18 years and older across all regions of Brazil through a combination of online and face-to-face interviews.
In addition to the quantitative survey, the study included qualitative data collection through focus group discussions conducted in Belém, Rio de Janeiro, Salvador, and São Paulo to better understand the mechanisms through which discrimination and exclusion affect labor market trajectories and socioeconomic outcomes.
South America; Latin America
The target population consisted of self-identified LGBTI+ adults aged 18 years and older residing in Brazil. Eligibility included individuals who identified with a sexual orientation, gender identity, gender expression, or sex characteristics that differ from heterosexual, cisgender, and endosex norms. Participants were recruited through a purposive, non-probability sampling approach using both online and in-person outreach strategies across all regions of the country.
For the qualitative component, the universe consisted of LGBTI+ adults aged 18 years and older participating in focus group discussions conducted in Belém, Rio de Janeiro, Salvador, and São Paulo.
| Name | Affiliation |
|---|---|
| Dominik Koehler | World Bank |
| Mariah Rafaela Cordeiro Gonzaga da Silva | World Bank |
| Andrew Flores | American University |
| Fernanda Fortes de Lena | Universitat Autònoma de Barcelona |
| Samuel Araújo Gomes da Silva | Fundação Oswaldo Cruz |
| Ana Maria Hermeto Camilo de Oliveira | Universidade Federal de Minas Gerais |
| Lucas Bulgarelli | Instituto Matizes |
| Hannah Maruci Aflalo | Instituto Matizes |
| Arthur Fontgaland | Instituto Matizes |
The study employed a purposive, non-probability sampling strategy to recruit LGBTI+ adults aged 18 years and older residing in Brazil. Due to the absence of a national sampling frame for LGBTI+ populations, probability-based sampling was not feasible. Recruitment combined online and in-person approaches to maximize participation across diverse geographic, socioeconomic, and demographic groups.
Data collection was conducted between June 12 and September 12, 2025, using a mixed-mode approach that combined Computer-Assisted Web Interviewing (CAWI) and Computer-Assisted Personal Interviewing (CAPI). Participants were recruited through digital outreach campaigns, social media, WhatsApp networks, partnerships with civil society organizations, community-based organizations, shelters, cultural spaces, LGBTI+ events, and targeted fieldwork in underserved and hard-to-reach communities.
To improve diversity and coverage, recruitment efforts were continuously monitored against demographic benchmarks derived from the 2022 Population Census and PNAD Contínua. Periodic quality-control reviews were conducted after approximately 5,000, 7,500, and 10,000 completed interviews to assess data quality, sample composition, and consistency. Additional outreach activities were implemented where necessary to increase participation among underrepresented groups and territories.
The final analytical sample consisted of 11,231 completed interviews. Because the sample was not randomly selected, survey weights and statistical adjustment procedures, including reweighting and matching techniques, were applied to improve comparability between the LGBTI+ sample and the general Brazilian population.
No major deviations from the planned sampling approach were identified during implementation. However, as anticipated in the study design, the survey relied on purposive, non-probability recruitment due to the absence of a sampling frame for LGBTI+ populations in Brazil.
Throughout data collection, continuous monitoring of sample composition revealed the need for additional targeted outreach to increase participation among underrepresented groups, including individuals living in rural areas, residents of peripheral communities, older adults, and highly vulnerable transgender and travesti populations. To address these gaps, recruitment efforts were adjusted through additional fieldwork, partnerships with community-based organizations, and targeted dissemination during major LGBTI+ events and conferences.
Despite these efforts, the final sample composition differed from the general Brazilian population in several respects, including age, educational attainment, and place of residence. Statistical adjustment procedures, including reweighting and matching techniques, were subsequently applied to improve comparability between the survey sample and national population benchmarks.
A conventional response rate cannot be calculated because the study employed a purposive, non-probability sampling design and no sampling frame for the target population exists in Brazil. Participants were recruited through multiple online and in-person outreach channels, including social media campaigns, community organizations, territorial partnerships, LGBTI+ events, and direct fieldwork.
Data collection was conducted between June 12 and September 12, 2025. During this period, approximately 18,000 survey submissions were received. Following data validation, quality-control procedures, and eligibility verification, 11,231 completed interviews were retained for analysis.
Given the recruitment strategy, the study does not report a survey response rate in the conventional statistical sense.
The survey employed post-stratification weighting and matching procedures to improve comparability between the non-probability LGBTI+ sample and the general Brazilian population. Because the survey relied on purposive recruitment and no sampling frame exists for LGBTI+ populations in Brazil, survey weights were not designed to produce nationally representative estimates of the LGBTI+ population. Instead, weighting was used to construct a statistically comparable counterfactual population for estimating labor market disparities and economic costs of exclusion.
The weighting procedure used PNAD Contínua 2024 as the reference population. A Covariate Balancing Propensity Score (CBPS) approach was first applied to assign weights to LGBTI+ respondents so that their distribution of observable characteristics aligned with the general population. Matching variables included age, sex, race/color, educational attainment, and state of residence.
Following CBPS estimation, an iterative proportional fitting (raking) procedure was applied to calibrate weights and ensure alignment with PNAD population margins for key demographic characteristics. This process adjusted the LGBTI+ sample to mirror the demographic composition of the Brazilian adult population while retaining all survey observations.
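The raking step described above can be illustrated with a minimal sketch. It is not the study's implementation: the function name `rake`, the record layout, and the convergence settings are assumptions made for illustration, and the starting weights stand in for whatever the CBPS step produced.

```python
def rake(rows, base_weights, margins, max_iter=100, tol=1e-8):
    """Iterative proportional fitting (raking) over categorical margins.

    rows:         list of dicts mapping variable name -> category (hypothetical layout)
    base_weights: starting weights, e.g. from a propensity-score step
    margins:      {variable: {category: target population share}}
    """
    weights = list(base_weights)
    for _ in range(max_iter):
        max_change = 0.0
        for var, targets in margins.items():
            total = sum(weights)
            # Current weighted total for each category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Factor that moves each category's weighted share to its target.
            factors = {cat: targets[cat] * total / current[cat] for cat in current}
            for i, row in enumerate(rows):
                f = factors[row[var]]
                max_change = max(max_change, abs(f - 1.0))
                weights[i] *= f
        # Stop once a full pass over all margins barely changes any weight.
        if max_change < tol:
            break
    return weights
```

Every record keeps a weight throughout, which is consistent with the statement that all survey observations are retained. The sketch assumes each category in the data has a target share and that the target shares for each variable sum to one.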
The final weighted dataset contains all 11,231 respondents. The weighting process produced an effective sample size (ESS) of approximately 5,288 observations, reflecting the variance introduced by weighting. This effective sample size should not be interpreted as a reduction in the number of observations but rather as a measure of weighting efficiency.
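The documentation does not state which formula was used for the effective sample size. A common choice is Kish's approximation, ESS = (Σw)² / Σw², sketched below as an assumption rather than as the study's method. The function name is hypothetical.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    sum_sq = sum(w * w for w in weights)
    return total * total / sum_sq


# Equal weights lose nothing: ESS equals the number of observations.
print(kish_effective_sample_size([1.0, 1.0, 1.0, 1.0]))
# Unequal weights reduce ESS below the number of observations.
print(kish_effective_sample_size([1.0, 1.0, 2.0]))
```

Under this definition, the reported figures imply a weighting efficiency of roughly 5,288 / 11,231 ≈ 0.47, or equivalently a weighting design effect of about 2.1.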
Weighted estimates were used for the economic and fiscal cost models and for comparisons between the LGBTI+ sample and the general population. Unweighted data were used when describing experiences of discrimination and exclusion within the LGBTI+ sample itself.
The survey questionnaire consists of the following modules:
Several additional processing steps were undertaken to prepare the data for analysis. Survey responses were harmonized and recoded to ensure consistency across collection modes (CAWI and CAPI) and to align key labor-market indicators with definitions used in Brazil’s official statistics, particularly PNAD Contínua.
Derived variables were constructed for employment status, unemployment, labor-force participation, informality, income, educational attainment, demographic characteristics, and experiences of workplace discrimination. Open-ended responses were reviewed and coded into standardized analytical categories where applicable.
To improve comparability between the LGBTI+ survey sample and the general Brazilian population, post-survey adjustment procedures were implemented. These included Covariate Balancing Propensity Score (CBPS) weighting followed by iterative proportional fitting (raking) calibrated to PNAD Contínua population distributions for age, sex, race/color, education, and state of residence. All observations were retained following the weighting process.
Additional analytical datasets were created by integrating survey data with reference population estimates derived from PNAD Contínua 2024. These processed datasets were used to estimate labor-market disparities and the economic and fiscal costs associated with exclusion based on sexual orientation, gender identity and expression, and sex characteristics (SOGIESC).
Qualitative data from focus group discussions were transcribed, anonymized, coded, and organized into thematic categories to support interpretation and triangulation of quantitative findings.
| Start | End | Cycle |
|---|---|---|
| 2025-06-12 | 2025-09-12 | 3 months |
Lucas Bulgarelli, Hannah Maruci, Mariah Rafaela Silva
2025 (cross-sectional survey)
The survey collected information on respondents' current socioeconomic and labor-market conditions, as well as retrospective information referring to different reference periods. Labor-force participation and employment questions referred primarily to the reference week preceding the interview; job-search activities referred to the four weeks preceding the interview; income-related questions generally referred to the previous 12 months; and workplace discrimination questions referred to experiences occurring during the previous five years.
Data editing was conducted throughout and after data collection to ensure data quality, consistency, and eligibility. Automated validation rules, skip patterns, range checks, and logical consistency checks were embedded in the questionnaire through the Survey Solutions platform to prevent invalid or contradictory responses during data collection.
Following fieldwork, records were reviewed for completeness, eligibility, duplication, and internal consistency. Eligibility criteria required respondents to be at least 18 years of age, reside in Brazil, and identify as part of the LGBTI+ population according to the survey screening questions. Incomplete interviews, ineligible cases, duplicate submissions, and records presenting substantial logical inconsistencies were excluded from the final analytical dataset.
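The screening rules in this paragraph can be sketched as a simple filter. The field names (`complete`, `age`, `resides_in_brazil`, `lgbti_screen`, `dedup_key`) are hypothetical and do not correspond to variables in the released dataset. The check for substantial logical inconsistencies is omitted because its rules are not documented here.

```python
def screen_records(records):
    """Keep complete, eligible records and drop repeat submissions.

    Each record is a dict with hypothetical keys; the first submission
    seen for a given dedup_key is the one retained.
    """
    seen_keys = set()
    kept = []
    for rec in records:
        eligible = (
            rec["complete"]
            and rec["age"] >= 18
            and rec["resides_in_brazil"]
            and rec["lgbti_screen"]
        )
        if not eligible:
            continue
        if rec["dedup_key"] in seen_keys:
            continue  # duplicate submission
        seen_keys.add(rec["dedup_key"])
        kept.append(rec)
    return kept
```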
Paradata, including interview duration, completion patterns, pauses, and navigation history, were reviewed to identify potentially low-quality interviews. Responses flagged as unreliable, such as interviews completed in unusually short periods or records exhibiting implausible response patterns, were subjected to additional review and excluded when appropriate.
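One way to flag interviews completed in unusually short periods is to compare each duration with a fraction of the median duration. The sketch below is illustrative only: the 30 percent cutoff and the function name are assumptions, not thresholds reported by the study, and flagged cases would still go to manual review rather than automatic exclusion.

```python
from statistics import median


def flag_short_interviews(durations_sec, fraction_of_median=0.3):
    """Return IDs of interviews shorter than a fraction of the median duration.

    durations_sec: {interview_id: duration in seconds} (hypothetical layout)
    fraction_of_median: assumed cutoff, not taken from the study
    """
    cutoff = fraction_of_median * median(durations_sec.values())
    return {iid for iid, d in durations_sec.items() if d < cutoff}
```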
Subsequent editing procedures included harmonization of variables, coding of open-ended responses, construction of derived indicators, preparation of analytical variables, and implementation of weighting and matching procedures used in the economic and fiscal cost analyses. The final edited dataset retained 11,231 eligible respondents.
Conventional estimates of sampling error are not available because the survey employed a purposive, non-probability sampling design and no sampling frame exists for the LGBTI+ population in Brazil. Consequently, standard design-based measures such as margins of error, confidence intervals derived from probability sampling, and sampling variances are not applicable.
To improve comparability with the general Brazilian population, the survey data were adjusted using Covariate Balancing Propensity Score (CBPS) weighting and iterative raking procedures calibrated to PNAD Contínua population benchmarks. While these adjustments reduce observable differences between the survey sample and the reference population, they do not permit the calculation of traditional sampling errors.
Accordingly, survey estimates should be interpreted as model-based estimates derived from a weighted and matched non-probability sample.
Multiple procedures were implemented to assess and enhance data quality throughout the study. During data collection, sample composition was continuously monitored and compared against demographic benchmarks derived from the 2022 Population Census, PNAD Contínua, and the National Health Survey (PNS). Weekly reviews were conducted to identify representation gaps and guide targeted outreach efforts to underrepresented groups and territories.
The Survey Solutions platform generated paradata, including interview duration, completion patterns, pauses, and navigation history, which were reviewed to identify potentially low-quality interviews. Automated validation rules, skip patterns, and consistency checks were embedded in the questionnaire to reduce reporting errors and logical inconsistencies during data collection.
Following fieldwork, the quality of the survey data was assessed through eligibility verification, internal consistency checks, examination of response distributions, and comparisons with official demographic statistics. Weighting diagnostics were also conducted to evaluate the effectiveness of the Covariate Balancing Propensity Score (CBPS) and raking procedures used to align the survey sample with PNAD Contínua population benchmarks.
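The documentation does not list the specific weighting diagnostics. A typical balance check compares the weighted sample share of a characteristic with the benchmark share using a standardized difference. The sketch below assumes binary indicators and uses hypothetical function names.

```python
from math import sqrt


def weighted_share(indicator, weights):
    """Weighted share of records where the indicator is true."""
    return sum(w for x, w in zip(indicator, weights) if x) / sum(weights)


def standardized_difference(p_sample, p_benchmark):
    """Standardized difference between two proportions.

    Values near zero indicate that the weighted sample is close to the
    benchmark on this characteristic. Undefined when both proportions
    are exactly 0 or 1.
    """
    pooled_var = (p_sample * (1 - p_sample) + p_benchmark * (1 - p_benchmark)) / 2
    return (p_sample - p_benchmark) / sqrt(pooled_var)
```

Computing this before and after weighting for each matching variable shows how much of the observable imbalance the weights remove.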
In addition, the study employed a mixed-methods design that allowed for triangulation between quantitative survey findings and qualitative evidence collected through 16 focus group discussions. This approach provided an additional layer of validation by assessing whether observed statistical patterns were consistent with participants’ lived experiences and narratives regarding labor market exclusion, discrimination, and economic inclusion.
Koehler, D., Silva, M. R. C. G. da, Flores, A., Lena, F. F. de, Silva, S. A. G. da, Oliveira, A. M. H. C. de, & Bulgarelli, L. (2026). The Economic Cost of Exclusion Based on Sexual Orientation, Gender Identity and Expression, and Sex Characteristics in the Labor Market in Brazil. World Bank.
This work is a product of The World Bank. The findings, interpretations, and conclusions expressed in this work do not necessarily reflect the views of the Executive Directors of The World Bank or the governments they represent.
The World Bank does not guarantee the accuracy, completeness, or currency of the data included in this work and does not assume responsibility for any errors, omissions, or discrepancies in the information, or liability with respect to the use of or failure to use the information, methods, processes, or conclusions set forth. The boundaries, colors, denominations, links/footnotes and other information shown in this work do not imply any judgment on the part of The World Bank concerning the legal status of any territory or the endorsement or acceptance of such boundaries. The citation of works authored by others does not mean the World Bank endorses the views expressed by those authors or the content of their works.
Nothing herein shall constitute or be construed or considered to be a limitation upon or waiver of the privileges and immunities of The World Bank, all of which are specifically reserved.
The material in this work is subject to copyright. Because The World Bank encourages dissemination of its knowledge, this work may be reproduced, in whole or in part, for noncommercial purposes as long as full attribution to this work is given.
| Name | Affiliation | Email |
|---|---|---|
| Dominik Koehler | TTL | dkoehler1@worldbank.org |
| Mariah Rafaela Cordeiro Gonzaga da Silva | Consultant | mcordeirogonzaga@worldbank.org |
DDI_BRA_2025_ECoE-SOGIESC_v01_M
| Name | Abbreviation | Affiliation | Role |
|---|---|---|---|
| Development Data Group | DECDG | World Bank Group | Documentation of the survey |
Version 01 (June 2026)