A rigorous end-to-end machine learning study completed at Harvard, applying 10+ algorithms to real-world datasets — from cancer diagnosis to movie recommendations.
Performance highlights across all models and datasets
sweep(). The most important predictor was area_worst.
Real outputs generated during the analysis
The most important insights from applying ML theory to real datasets
sweep(), KNN at k=9 outperformed all individual models and even the 4-model ensemble on cancer diagnosis.
All methods applied, tuned, and evaluated with proper cross-validation
area_worst in BRCA.
rpart. Tuned with complexity parameter cp.
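As a rough illustration of what tuning a tree with the complexity parameter cp can look like, here is a minimal sketch using rpart's built-in cross-validation. It is not the study's actual code: the built-in iris data stands in for the study's datasets, and the seed, the starting value cp = 0, and the 10 folds are assumptions chosen for the example.

```r
library(rpart)

set.seed(2024)  # arbitrary seed (assumption) so the CV folds are reproducible

# Grow a deliberately large tree (cp = 0) and let rpart run 10-fold
# cross-validation internally (xval = 10). iris is stand-in data.
fit <- rpart(Species ~ ., data = iris, method = "class",
             control = rpart.control(cp = 0, xval = 10))

# cptable has one row per candidate cp; "xerror" is the cross-validated error.
cp_table <- fit$cptable
best_cp <- cp_table[which.min(cp_table[, "xerror"]), "CP"]

# Prune the tree back to the cp with the lowest cross-validated error.
pruned <- prune(fit, cp = best_cp)

# Predictors ranked by rpart's variable importance measure.
importance <- sort(pruned$variable.importance, decreasing = TRUE)
print(best_cp)
print(importance)
```

The first entry of `importance` is the predictor the pruned tree relies on most, which is the kind of ranking a "most important predictor" statement would come from. A larger cp prunes more aggressively, so selecting it by cross-validated error trades tree size against fit.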