Machine Learning Algorithms
Machine learning has been widely adopted to predict outcomes based on existing data. This blog post links to resources on the most common machine learning algorithms.
Random Forest
The splitting criterion minimizes the Gini impurity (for classification) or the sum of squared errors, SSE (for regression). (Reference)
Equivalently, the criterion searches for splits associated with heterogeneity in the outcome across groups.
Random Forest in Python (by Will Koehrsen)
Grid Search Cross Validation (by Will Koehrsen)
Causal Random Forest
The splitting criterion searches for splits associated with treatment effect heterogeneity (rather than outcome heterogeneity).
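The idea behind this criterion can be illustrated with a hand-rolled toy split search (a sketch of the principle only, not the actual grf/econml estimators): estimate the treatment effect in each candidate child node by a treated-minus-control difference in means, and keep the split where the two child effects differ most.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.uniform(0, 1, n)        # single covariate
t = rng.integers(0, 2, n)       # randomly assigned binary treatment
# True treatment effect is 2 when x > 0.5, else 0 (effect heterogeneity at 0.5).
tau = np.where(x > 0.5, 2.0, 0.0)
y = tau * t + rng.normal(0, 0.1, n)

def child_effect(mask):
    """Difference-in-means treatment effect estimate within a child node."""
    return y[mask & (t == 1)].mean() - y[mask & (t == 0)].mean()

best_split, best_gap = None, -np.inf
for c in np.linspace(0.1, 0.9, 81):
    left, right = x <= c, x > c
    # Score a split by how much the estimated effects in its children differ.
    gap = (child_effect(left) - child_effect(right)) ** 2
    if gap > best_gap:
        best_split, best_gap = c, gap

print(round(best_split, 2))   # the search should recover a cut near 0.5
```

A causal forest repeats this kind of effect-heterogeneity split search recursively over many subsampled trees, with honest sample splitting; this toy version only shows what a single split optimizes.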
Oblique Regression Tree
Cattaneo, Chandak, and Klusowski (2022): Convergence Rates of Oblique Regression Trees for Flexible Function Libraries
- Splits are based on linear combinations of the covariates
- An oracle inequality allows oblique trees to be compared with projection pursuit regression and neural networks
- Under suitable conditions, oblique decision trees achieve predictive accuracy similar to neural networks for the same library of regression models, so interpretability need not always be traded off against accuracy
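A toy numerical illustration (hand-rolled, not the paper's estimator) of why splits on linear combinations can be more parsimonious: when the label depends on x1 + x2, a single oblique cut separates the classes, while any single axis-aligned cut cannot.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # label depends on a linear combination

def split_accuracy(scores):
    """Best accuracy achievable with one threshold on a 1-D projection."""
    order = np.sort(scores)
    thresholds = (order[:-1] + order[1:]) / 2
    return max(((scores > c).astype(int) == y).mean() for c in thresholds)

# Axis-aligned: the best single split on either raw coordinate.
axis_aligned = max(split_accuracy(X[:, 0]), split_accuracy(X[:, 1]))
# Oblique: a single split on the linear combination x1 + x2.
oblique = split_accuracy(X[:, 0] + X[:, 1])

print(axis_aligned, oblique)
```

Here the oblique split classifies essentially perfectly, while the best axis-aligned split sits near 75% accuracy; an axis-aligned tree would need many stair-step splits to approximate the diagonal boundary.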