The documented dataset covers Enterprise Survey (ES) panel data collected in Argentina in 2006, 2010 and 2017, as part of the Enterprise Survey initiative of the World Bank. An Indicator Survey is similar to an Enterprise Survey; it is implemented for smaller economies where the sampling strategies inherent in an Enterprise Survey are often not applicable due to the limited universe of firms.
The objective of the 2006-2017 Enterprise Survey is to obtain feedback from enterprises in client countries on the state of the private sector as well as to build a panel of enterprise data that will make it possible to track changes in the business environment over time and allow, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the Indicator Survey data provides information on the constraints to private sector growth and is used to create statistically significant business environment indicators that are comparable across countries.
As part of its strategic goal of building a climate for investment, job creation, and sustainable growth, the World Bank has promoted improving the business environment as a key strategy for development, which has led to a systematic effort in collecting enterprise data across countries. The Enterprise Surveys (ES) are an ongoing World Bank project in collecting both objective data based on firms' experiences and enterprises' perception of the environment in which they operate.
Kind of data
Sample survey data [ssd]
v01, edited, anonymous dataset for public distribution
The Enterprise Surveys panel datasets have the following common format:
• Variable panel allows easy identification of panel observations
• Variable panelid is the same across the waves for the same firm
• Variable eligibility<year> reports eligibility status of all firms interviewed in the previous wave as of the <year> of the latest wave
e.g. in a 2013-2016 panel, eligibility2016 reports the status as of 2016 of all firms interviewed in 2013
• Wherever possible variables are matched across waves. If needed, matches are made by converting variable names in older waves to variable names in the most recent wave
• Due to methodological changes and evolution of the survey instrument it is not possible to match all variables in the datasets
• Variables that are not matched across waves are named as _<year>_<variable>, with the year in which the variable was collected (e.g. _2013_date)
• It is recommended that users thoroughly familiarize themselves with the questionnaires from each of the years contained in the dataset before proceeding with analysis
• Some monetary variables in the 2002 and 2005 surveys (reported in US currency) are converted into local currency units (LCU) using market, period-average exchange rates. The source of the exchange rates is the International Financial Statistics (IFS) of the IMF.
• Weights are representative of the universe for the year that the firm was interviewed. They are not panel weights.
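The naming conventions above can be applied programmatically. The following is a minimal sketch, not part of the official documentation: it assumes the data have been loaded as a list of records, and the "year" key and the sample rows are hypothetical. Only panelid and the _<year>_<variable> pattern are taken from the description above.

```python
import re
from collections import defaultdict

# Pattern for variables that could not be matched across waves,
# e.g. "_2013_date" -> year 2013, base name "date".
UNMATCHED = re.compile(r"^_(\d{4})_(.+)$")


def split_unmatched(name):
    """Return (year, base_name) for a wave-specific variable, else None."""
    m = UNMATCHED.match(name)
    return (int(m.group(1)), m.group(2)) if m else None


def firms_by_wave(rows):
    """Map each panelid to the sorted list of years in which it appears."""
    waves = defaultdict(set)
    for row in rows:
        waves[row["panelid"]].add(row["year"])
    return {pid: sorted(years) for pid, years in waves.items()}


# Hypothetical records; "year" is an assumed column name.
rows = [
    {"panelid": 101, "year": 2006},
    {"panelid": 101, "year": 2010},
    {"panelid": 101, "year": 2017},
    {"panelid": 102, "year": 2010},
]
waves = firms_by_wave(rows)

# Firms observed in more than one wave form the panel component.
repeat_firms = sorted(pid for pid, years in waves.items() if len(years) > 1)
```

Because the weights are cross-sectional rather than panel weights, a selection like repeat_firms identifies panel observations but does not by itself make them representative of any universe.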
Regions covered are selected based on the number of establishments, contribution to employment, and value added. In most cases these regions are metropolitan areas and reflect the largest centers of economic activity in a country.
Unit of analysis
The primary sampling unit of the study is the establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors.
Producers and sponsors
The sample for the 2006-2017 Argentina Enterprise Survey (ES) was selected using stratified random sampling, following the methodology explained in the Sampling Manual. Stratified random sampling was preferred over simple random sampling for several reasons:
- To obtain unbiased estimates for different subdivisions of the population with some known level of precision.
- To obtain unbiased estimates for the whole population. The whole population, or universe of the study, is the non-agricultural economy. It comprises: all manufacturing sectors (group D), construction (group F), services (groups G and H), and transport, storage, and communications (group I). Groups are defined following ISIC revision 3.1. Note that this definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, excluding sub-sector 72, IT, which was added to the population under study), and all public or utilities-sectors.
- To make sure that the final total sample includes establishments from all different sectors and that it is not concentrated in one or two industries/sizes/regions.
- To exploit the benefits of stratified sampling where population estimates, in most cases, will be more precise than using a simple random sampling method (i.e., lower standard errors, other things being equal).
Three levels of stratification were used in every country: industry, establishment size, and region.
Industry stratification was designed in the following way: In small economies the population was stratified into 3 manufacturing industries, one services industry - retail-, and one residual sector as defined in the sampling manual. Each industry had a target of 120 interviews. In middle size economies the population was stratified into 4 manufacturing industries, 2 services industries -retail and IT-, and one residual sector. For the manufacturing industries sample sizes were inflated by 25% to account for potential non-response in the financing data.
For the Argentina ES, size stratification was defined following the standardized definition for the rollout: small (5 to 19 employees), medium (20 to 99 employees), and large (more than 99 employees). For stratification purposes, the number of employees was defined on the basis of reported permanent full-time workers. This resulted in some difficulties in certain countries where seasonal/casual/part-time labor is common.
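The size strata can be expressed as a simple classification rule. The sketch below is illustrative only: the function name is hypothetical, and returning None for establishments with fewer than 5 workers is an assumption, since the text defines no stratum for them.

```python
def size_stratum(permanent_full_time_workers):
    """Classify an establishment into the size strata described above.

    Thresholds follow the text: small 5-19, medium 20-99, large 100+.
    Returning None below 5 workers is an assumption of this sketch.
    """
    n = permanent_full_time_workers
    if n < 5:
        return None
    if n <= 19:
        return "small"
    if n <= 99:
        return "medium"
    return "large"
```

For example, size_stratum(19) gives "small", while size_stratum(20) gives "medium".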
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies:
a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect the refusal to respond (-8) as a different option from don't know (-9).
b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary. However, there were clear cases of low response. The following graph shows non-response rates for the sales variable, d2, by sector. Please note that for this specific question, refusals were not separately identified from "Don't know" responses.
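Item non-response rates of the kind described above can be computed from the special codes. This is a minimal sketch under stated assumptions: the "sector" key and the sample records are hypothetical, and both -8 and -9 are counted as non-response.

```python
from collections import defaultdict

REFUSED = -8      # refusal to respond
DONT_KNOW = -9    # don't know
MISSING_CODES = {REFUSED, DONT_KNOW}


def nonresponse_by_sector(rows, var="d2"):
    """Share of records per sector whose value for `var` is -8 or -9."""
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in rows:
        totals[row["sector"]] += 1
        if row[var] in MISSING_CODES:
            missing[row["sector"]] += 1
    return {sector: missing[sector] / totals[sector] for sector in totals}


# Hypothetical records; "sector" is an assumed column name.
rows = [
    {"sector": "manufacturing", "d2": 1000},
    {"sector": "manufacturing", "d2": -9},
    {"sector": "retail", "d2": 500},
    {"sector": "retail", "d2": 200},
    {"sector": "retail", "d2": 750},
    {"sector": "retail", "d2": -9},
]
rates = nonresponse_by_sector(rows)
```

Both codes should be recoded to missing before computing means or totals, otherwise the negative values would be treated as real amounts.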
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals; whenever this was done, strict rules were followed to ensure replacements were randomly selected within the same stratum. Further research is needed on survey non-response in the Enterprise Surveys regarding potential introduction of bias.
Since the sampling design was stratified and employed differential sampling, individual observations should be properly weighted when making inferences about the population. Under stratified random sampling, unweighted estimates are biased unless sample sizes are proportional to the size of each stratum. With stratification, the probability of selection of each unit is, in general, not the same. Consequently, individual observations must be weighted by the inverse of their probability of selection (probability weights, or pw in Stata).
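The effect of inverse-probability weighting can be illustrated with a small example. The sketch below assumes simple random sampling within each stratum, so that the selection probability is n_h / N_h and the weight is N_h / n_h. It ignores the eligibility adjustments applied to the actual survey weights, and all counts and values are hypothetical.

```python
def stratum_weights(population_counts, sample_counts):
    """Inverse selection probability per stratum: N_h / n_h."""
    return {h: population_counts[h] / sample_counts[h] for h in sample_counts}


def weighted_mean(values, weights):
    """Weighted mean: sum(w * y) / sum(w)."""
    total_weight = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical universe: 900 small and 100 large establishments,
# with 2 establishments sampled from each stratum.
population = {"small": 900, "large": 100}
sample = {"small": 2, "large": 2}
w = stratum_weights(population, sample)

# Hypothetical 0/1 outcome for the four sampled establishments.
strata = ["small", "small", "large", "large"]
values = [1, 1, 0, 0]
weights = [w[h] for h in strata]

unweighted = sum(values) / len(values)
weighted = weighted_mean(values, weights)
```

Here large establishments are oversampled relative to their share of the universe, so the unweighted mean (0.5) understates the population share (0.9) that the weighted estimate recovers.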
For some units it was impossible to determine eligibility because the contact was not successfully completed. Consequently, different assumptions as to their eligibility result in different universe cells' adjustments and in different sampling weights. Three sets of assumptions were considered:
a- Strict assumption: eligible establishments are only those for which it was possible to directly determine eligibility. The resulting weights are included in the variable w_strict.
b- Median assumption: eligible establishments are those for which it was possible to directly determine eligibility, those that rejected the screener questionnaire, and those for which an answering machine or fax was the only response. The resulting weights are included in the variable w_median. Median weights are used for computing indicators on the www.enterprisesurveys.org website.
c- Weak assumption: in addition to the establishments included in points a and b, all establishments for which it was not possible to finalize a contact are assumed eligible. This includes establishments with dead or out of service phone lines, establishments that never answered the phone, and establishments with incorrect addresses for which it was impossible to find a new address. The resulting weights are included in the variable w_weak. Note that under the weak assumption only observed non-eligible units are excluded from universe projections.
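The three assumptions differ only in which contact outcomes are counted as eligible. The following sketch makes the nesting explicit. The outcome labels are hypothetical and are not the codes used in the dataset, and the actual weight computation involves further universe adjustments that are not reproduced here.

```python
# Hypothetical contact-outcome labels (not the dataset's actual codes).
DIRECT = {"eligible_confirmed"}
MEDIAN_EXTRA = {"refused_screener", "answering_machine", "fax"}
WEAK_EXTRA = {"dead_line", "no_answer", "wrong_address"}

ELIGIBLE_SETS = {
    "strict": DIRECT,
    "median": DIRECT | MEDIAN_EXTRA,
    "weak": DIRECT | MEDIAN_EXTRA | WEAK_EXTRA,
}


def assumed_eligible(outcome, assumption):
    """True if a contact outcome counts as eligible under the assumption."""
    return outcome in ELIGIBLE_SETS[assumption]


def eligible_counts(outcomes):
    """Number of contacts treated as eligible under each assumption."""
    return {
        name: sum(1 for o in outcomes if assumed_eligible(o, name))
        for name in ELIGIBLE_SETS
    }


outcomes = [
    "eligible_confirmed",
    "refused_screener",
    "answering_machine",
    "no_answer",
    "dead_line",
    "ineligible_confirmed",
]
counts = eligible_counts(outcomes)
```

The count of assumed-eligible units can only grow from strict to median to weak, and a unit observed to be ineligible is excluded under all three, which matches the note on the weak assumption above.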
The "Core Questionnaire" is the heart of the Enterprise Survey and contains the survey questions asked of all firms across the world. There are also two other survey instruments - the "Core Questionnaire + Manufacturing Module" and the "Core Questionnaire + Retail Module." The survey is fielded via three instruments in order to not ask questions that are irrelevant to specific types of firms, e.g. a question that relates to production and nonproduction workers should not be asked of a retail firm. In addition to questions that are asked across countries, all surveys are customized and contain country-specific questions. An example of customization would be including tourism-related questions that are asked in certain countries when tourism is an existing or potential sector of economic growth.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs/labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Confidentiality of the survey respondents and the sensitive information they provide is necessary to ensure the greatest degree of survey participation, integrity and confidence in the quality of the data. Surveys are usually carried out in cooperation with business organizations and government agencies promoting job creation and economic growth, but confidentiality is never compromised.
Aggregate indicators based on Enterprise Survey data are available to the public at https://www.enterprisesurveys.org
Firm-level data is also available to the public free of charge. In order to access the firm-level data, users must agree to abide by a strict confidentiality agreement available through the Enterprise Analysis Unit website by clicking on "External users register here" at https://www.enterprisesurveys.org/Portal
Use of the dataset must be acknowledged using a citation which would include:
- the identification of the primary investigator
- the title of the survey (including country, acronym and year of implementation)
- the survey reference number
- the source and date of download
The World Bank. Argentina Enterprise Survey (ES-P) 2006-2017, Panel Data, Ref. ARG_2006-2017_ES-P_v01_M. Dataset downloaded from [URL] on [date].
The user of the data acknowledges that the original collector of the data, the authorized distributor of the data, and the relevant funding agency bear no responsibility for use of the data or for interpretations or inferences based upon such uses.