Decision Tree and Ensemble Methods
In this assignment you will implement decision trees and explore ensemble methods that combine multiple learners.
You will cover:
- Decision tree: Implement a decision tree classifier using information gain (entropy) or Gini impurity as the splitting criterion; control tree depth to study the bias–variance tradeoff.
- Bagging / Random Forest: Train an ensemble of trees on bootstrapped subsets of the data and aggregate predictions by majority vote or averaging.
- Boosting: Implement AdaBoost or Gradient Boosting, where each new tree focuses on the mistakes of the current ensemble (by reweighting misclassified examples in AdaBoost, or by fitting the residual errors in Gradient Boosting).
- Feature importance: Compute and visualize which features contribute most to predictions.
- Comparison: Evaluate a single decision tree against random forest and boosting on the same dataset and explain the performance differences.
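As a starting point for the splitting criterion, here is a minimal sketch of entropy, Gini impurity, and information gain for a binary split. The function names (`entropy`, `gini`, `information_gain`) are illustrative, not a required API:

```python
from collections import Counter
import math

def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: chance of mislabeling a random sample drawn from `labels`."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy reduction achieved by splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted
```

A perfectly separating split of `[0, 0, 1, 1]` into `[0, 0]` and `[1, 1]` yields a gain of 1 bit; your tree would evaluate candidate splits and keep the one with the highest gain (or lowest weighted Gini).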
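The bagging step can be sketched independently of the tree implementation: draw bootstrap replicates, train one model per replicate, and aggregate by majority vote. The helper names below are illustrative, and the "models" are represented as plain prediction functions:

```python
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    """Draw len(X) examples with replacement (one bootstrap replicate)."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def majority_vote(predictions):
    """Return the most common label among the ensemble's predictions."""
    return Counter(predictions).most_common(1)[0][0]

def bagging_predict(models, x):
    """Classify x by majority vote over all models in the ensemble."""
    return majority_vote([predict(x) for predict in models])
```

For a random forest you would additionally restrict each split to a random subset of features; for regression, replace the majority vote with an average.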
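One model-agnostic way to compute feature importance is permutation importance: shuffle one feature column and measure how much accuracy drops. This is a sketch under the assumption that the model is exposed as a `predict(x)` function and that `X` is a list of feature lists:

```python
import random

def permutation_importance(predict, X, y, rng, n_repeats=5):
    """Mean accuracy drop when each feature column is shuffled in turn."""
    def accuracy(rows):
        return sum(predict(x) == t for x, t in zip(rows, y)) / len(y)

    base = accuracy(X)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [x[j] for x in X]
            rng.shuffle(col)  # break the link between feature j and the labels
            X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
            drops.append(base - accuracy(X_perm))
        importances.append(sum(drops) / n_repeats)
    return importances
```

A feature the model ignores scores exactly zero, since shuffling it cannot change any prediction; for tree ensembles you can compare this against the impurity-based importances accumulated during training.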
Problems will be released closer to the due date.