Hyperparameters are configuration settings that are specified before training a model – as distinct from model parameters, or weights, which an AI/ML model learns during training.
For many machine learning problems, finding the best hyperparameters is an iterative and potentially time-intensive process called “hyperparameter optimization.”
Examples of hyperparameters include the number of hidden layers and the learning rate of deep neural network algorithms, the number of leaves and depth of trees in decision tree algorithms, and the number of clusters in clustering algorithms.
Hyperparameters directly impact the performance of a trained machine learning model. Choosing the right hyperparameters can dramatically improve prediction accuracy. However, they can be challenging to optimize because the number of possible hyperparameter combinations is often very large.
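To see how quickly the search space grows, consider counting the combinations for a small, hypothetical search space (the parameter names and candidate values below are illustrative, not from any particular model):

```python
from itertools import product

# Hypothetical search space for a small neural network
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "hidden_layers": [1, 2, 3],
    "batch_size": [32, 64, 128, 256],
    "dropout": [0.0, 0.2, 0.5],
}

# Every combination of one value per hyperparameter
combinations = list(product(*search_space.values()))
print(len(combinations))  # 3 * 3 * 4 * 3 = 108
```

Even with only four hyperparameters and a handful of candidate values each, there are 108 combinations to evaluate – and each evaluation may require a full training run.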
To address the challenge of hyperparameter optimization, data scientists use specific optimization algorithms designed for this task. Examples of hyperparameter optimization algorithms are grid search, random search, and Bayesian optimization. These optimization approaches help narrow the search space of all possible hyperparameter combinations to find the best (or near best) result. Hyperparameter optimization is also a critical area where the data scientist’s experience and intuition matter.