*** Readme: Replication files for "Predicting Entrepreneurial Success is Hard: Evidence from a Business Plan Competition in Nigeria"
**** David McKenzie and Dario Sansone

************************************************************************
*************** Data files *********************************************
************************************************************************

*** The Data, Questionnaires, and Supplementary Documentation are also available in the World Bank's Open Data Library:
http://microdata.worldbank.org/index.php/catalog/2329

BaselineandFirstFollowup.dta:  Baseline and first follow-up round data  
SecondFollowup.dta:  Second follow-up round data  
ThirdFollowup.dta:  Third follow-up round data 
regionuid.dta: data on the region of application for the non-experimental sample used in the RD analysis
VariablesRestrictedData.dta: additional variables that weren't included in the original public data release
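
A minimal sketch for checking that the data files listed above are in place before running anything. It assumes the Stata working directory is the folder holding the .dta files; it only opens each file and prints a short description, and it makes no assumption about variable names or merge keys.

```stata
* Sketch only: assumes the current working directory contains the .dta files.
foreach f in BaselineandFirstFollowup SecondFollowup ThirdFollowup ///
             regionuid VariablesRestrictedData {
    * Skip any file that is missing instead of stopping with an error
    capture confirm file "`f'.dta"
    if _rc {
        display as error "`f'.dta not found in `c(pwd)'"
        continue
    }
    use "`f'.dta", clear
    * Print the number of observations and variables only
    describe, short
}
```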

************************************************************************
*************** Do files *********************************************
************************************************************************

The following files are in Stata format (version 14.1 unless noted otherwise)

A1_DataPreparation.do: generate the dataset used in the paper (MLNigeria_masterdata.dta)
A2_DescriptiveStats.do: replicate Figures 1-4
A2b_AddedValue.do: replicate Figure A1 and Table A5
A3_BusinessPlanvsSurvey.do: replicate Tables 1-3, A1-A4
A4_PredictionOLS.do: replicate OLS results in Tables 5-6, A10, A12-A15 - Employment, Sales, Profits
A4b_PredictionLogitBinary.do: replicate Logit/Probit results in Tables 4, A11, A15 - Survival
A4c_PredictionOLSTail.do: replicate OLS results in Table 7 Panels A-B
B1_PredictionSVM.do: replicate SVM/SVR results in Tables 5-6, A8, A10, A12-A14, A17 - Employment, Sales, Profits
B1b_PredictionSVMBinary.do: replicate SVM/SVR results in Tables 4, A8, A11, A17 - Survival
B2_PredictionLASSO.do: replicate LASSO results in Tables 5-6, A6-A7, A10, A12-A17 and Figures A3-A6 - Employment, Sales, Profits
B2b_PredictionLASSOBinary.do: replicate LASSO results in Tables 4, A6-A7, A11, A15-A17 and Figure A2 - Survival
B2c_PredictionLASSOVarSel.do: replicate Table A21
B3_PredictionBoosting.do: replicate Boost results in Tables 5-6, A9-A10, A12-A15, A17 - Employment, Sales, Profits
B3b_PredictionBoostingBinary.do: replicate Boost results in Tables 4, A9, A11, A17 - Survival
B3c_PredictionBoostingVarSel.do: replicate Table A20 
B4_PredictionElasticNet.do: replicate Table A19 (Employment, Sales, Profits)
B4b_PredictionElasticNetBinary.do: replicate Table A19 (Survival)
B5_PredictionEnsemple.do: replicate Table A18
B6_PredictionMLTail.do: replicate ML results in Table 7 Panels A-B
B7_PortfolioSimulation.do: replicate Table 7 Panel C
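
A hedged sketch of a master script that runs the do files in the order listed above. A1_DataPreparation is run first because it generates MLNigeria_masterdata.dta; beyond that, the listing order is an assumption rather than a documented requirement. The script also assumes that all do files sit in the current working directory and that the user-written commands needed by the B files are already installed.

```stata
* Sketch only: run order follows the listing in this readme (an assumption).
version 14.1

local dofiles ///
    A1_DataPreparation A2_DescriptiveStats A2b_AddedValue ///
    A3_BusinessPlanvsSurvey A4_PredictionOLS A4b_PredictionLogitBinary ///
    A4c_PredictionOLSTail B1_PredictionSVM B1b_PredictionSVMBinary ///
    B2_PredictionLASSO B2b_PredictionLASSOBinary B2c_PredictionLASSOVarSel ///
    B3_PredictionBoosting B3b_PredictionBoostingBinary ///
    B3c_PredictionBoostingVarSel B4_PredictionElasticNet ///
    B4b_PredictionElasticNetBinary B5_PredictionEnsemple ///
    B6_PredictionMLTail B7_PortfolioSimulation

foreach f of local dofiles {
    display as text "Running `f'"
    * Stata appends the .do extension when none is given
    do "`f'"
}
```

To replicate a single table or figure, it may be enough to run A1_DataPreparation followed by the one do file of interest.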

Required additional Stata commands:

svmachines (http://schonlau.net/publication/16svm_stata.pdf)
elasticregress (https://ideas.repec.org/c/boc/bocode/s458397.html)
boosting (https://www.stata-journal.com/sjpdf.html?articlenum=st0087)
see also https://www.stata.com/stata-news/news33-4/users-corner/
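
A short sketch for checking whether the commands above are already installed. The command names svmachines and elasticregress are taken from the list; boost is assumed to be the command name provided by the boosting package. The ssc install lines in the comments assume that svmachines and elasticregress are distributed through SSC; for boosting, follow the Stata Journal link above.

```stata
* Sketch only: report which user-written commands are on the ado-path.
foreach cmd in svmachines elasticregress boost {
    capture which `cmd'
    if _rc {
        display as error "`cmd' not found; see the links in this readme"
    }
    else {
        display as text "`cmd' is installed"
    }
}

* Assumed install route for the two SSC-hosted packages (uncomment to use):
* ssc install svmachines
* ssc install elasticregress
```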

