A Hands-on Guide to Creating Explainable Gradient Boosting Classification Models Using Bayesian Hyperparameter Optimization.

Boosted decision tree algorithms, such as XGBoost, CatBoost, and LightBoost, are popular methods for classification tasks. Learn how to split the data, optimize hyperparameters, prevent overtraining, select the best-performing model, and create explainable results.

Erdogan Taskesen



This blog is part of a series. The first part explains the general concepts of gradient boosting techniques such as XGBoost, CatBoost, and LightBoost, together with the process of tuning hyperparameters and the details of the HGBoost library. In this second part, I will demonstrate in more detail: 1. how to train a gradient boosting classification model whose hyperparameters are tuned with Bayesian optimization, 2. how to select the best-performing model that is not overtrained, and 3. how to create explainable results by visualizing the optimized hyperparameter space together with the model performance.
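To make the idea of "best-performing but not overtrained" concrete, here is a minimal sketch that uses only the Python standard library. It is not how HGBoost is implemented: the toy classifier, the function names (`three_way_split`, `fit_cut_point`, `accuracy`), and the synthetic data are all hypothetical, and plain random search stands in for Bayesian optimization. What it shows is the role of each data split: fit on the train set, pick the hyperparameter on the test set, and report the final score on a validation set that the search never touched.

```python
import random


def three_way_split(n_samples, test_size=0.2, val_size=0.2, seed=0):
    """Shuffle row indices and split them into train, test, and validation."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_val = round(n_samples * val_size)
    n_test = round(n_samples * test_size)
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]
    return train, test, val


def fit_cut_point(X, y, rows, w):
    """Toy 'training': place a cut point between the two class means.

    The hyperparameter w (between 0 and 1) moves the cut from the
    class-0 mean (w=0) towards the class-1 mean (w=1).
    """
    class0 = [X[i] for i in rows if y[i] == 0]
    class1 = [X[i] for i in rows if y[i] == 1]
    mean0 = sum(class0) / len(class0)
    mean1 = sum(class1) / len(class1)
    return (1 - w) * mean0 + w * mean1


def accuracy(cut, X, y, rows):
    """Fraction of rows where the rule 'predict 1 if x > cut' is correct."""
    hits = sum(int(X[i] > cut) == y[i] for i in rows)
    return hits / len(rows)


# Synthetic data (an assumption for illustration): one feature, and the
# label is 1 whenever the feature exceeds 0.6.
data_rng = random.Random(42)
X = [data_rng.random() for _ in range(500)]
y = [int(x > 0.6) for x in X]

train, test, val = three_way_split(len(X))

# Random search over w; a Bayesian optimizer would propose candidates
# based on earlier scores instead of sampling them blindly.
search_rng = random.Random(1)
best_w, best_test_acc = None, -1.0
for _ in range(30):
    w = search_rng.random()
    cut = fit_cut_point(X, y, train, w)
    acc = accuracy(cut, X, y, test)
    if acc > best_test_acc:
        best_w, best_test_acc = w, acc

# The validation rows were never used during the search, so this score
# is an independent check on the selected hyperparameter.
final_cut = fit_cut_point(X, y, train, best_w)
val_acc = accuracy(final_cut, X, y, val)
print(f"best w={best_w:.3f}  test acc={best_test_acc:.3f}  val acc={val_acc:.3f}")
```

If the validation accuracy is much lower than the test accuracy, the search has likely overfitted the test set, which is exactly the situation the model selection step should guard against.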


A brief introduction.

Gradient boosting algorithms such as Extreme Gradient Boosting (XGBoost), Light Gradient Boosting (LightBoost), and CatBoost are powerful ensemble machine learning algorithms for predictive modeling. They can be applied to tabular and continuous data, and to both classification and regression tasks [1,2,3]. Here I will focus on the classification task. If you need more background or are not entirely familiar with some of the concepts, I recommend reading "A Guide to Find the Best Boosting Model using Bayesian Hyperparameter Tuning but without Overfitting". Before we move on to the hands-on example, I will briefly discuss the HGBoost library [4], because we will use this single library for all the tasks.

The Hyperoptimized Gradient Boosting library.
