optuna.pruners

class optuna.pruners.BasePruner[source]

Base class for pruners.

abstract prune(study, trial)[source]

Judge whether the trial should be pruned based on the reported values.

Note that this method is not supposed to be called by library users. Instead, optuna.trial.Trial.report() and optuna.trial.Trial.should_prune() provide the user interface for implementing a pruning mechanism in an objective function.

Parameters
  • study – Study object of the target study.

  • trial – FrozenTrial object of the target trial. Take a copy before modifying this object.

Returns

A boolean value representing whether the trial should be pruned.
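
A custom pruner can be written by subclassing BasePruner and overriding prune(). The sketch below is a minimal, hypothetical example (the class name and threshold are not part of Optuna) that prunes a trial whose latest reported value falls below a fixed threshold.

import optuna


class FixedThresholdPruner(optuna.pruners.BasePruner):
    """Hypothetical pruner: prune if the latest reported value is below a threshold."""

    def __init__(self, threshold):
        self._threshold = threshold

    def prune(self, study, trial):
        step = trial.last_step  # the latest step passed to Trial.report()
        if step is None:
            # Nothing has been reported yet, so there is nothing to judge.
            return False
        return trial.intermediate_values[step] < self._threshold

study = optuna.create_study(pruner=FixedThresholdPruner(threshold=0.5))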

class optuna.pruners.MedianPruner(n_startup_trials=5, n_warmup_steps=0, interval_steps=1)[source]

Pruner using the median stopping rule.

Prune if the trial’s best intermediate result is worse than the median of the intermediate results of previous trials at the same step.

Example

We optimize an objective function with the median stopping rule.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)

def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)
    n_train_iter = 100

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)

study = optuna.create_study(direction='maximize',
                            pruner=optuna.pruners.MedianPruner(n_startup_trials=5,
                                                               n_warmup_steps=30,
                                                               interval_steps=10))
study.optimize(objective, n_trials=20)
Parameters
  • n_startup_trials – Pruning is disabled until the given number of trials finish in the same study.

  • n_warmup_steps – Pruning is disabled until the trial exceeds the given number of steps.

  • interval_steps – Interval in number of steps between the pruning checks, offset by the warmup steps. If no value has been reported at the time of a pruning check, that particular check will be postponed until a value is reported.

class optuna.pruners.NopPruner[source]

Pruner which never prunes trials.

Example

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)

def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)
    n_train_iter = 100

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            assert False, "should_prune() should always return False with this pruner."
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)

study = optuna.create_study(direction='maximize',
                            pruner=optuna.pruners.NopPruner())
study.optimize(objective, n_trials=20)
class optuna.pruners.PercentilePruner(percentile, n_startup_trials=5, n_warmup_steps=0, interval_steps=1)[source]

Pruner to keep the specified percentile of the trials.

Prune if the best intermediate value is in the bottom percentile among trials at the same step.

Example

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)

def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)
    n_train_iter = 100

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)

study = optuna.create_study(
    direction='maximize',
    pruner=optuna.pruners.PercentilePruner(25.0, n_startup_trials=5,
                                           n_warmup_steps=30, interval_steps=10))
study.optimize(objective, n_trials=20)
Parameters
  • percentile – Percentile which must be between 0 and 100 inclusive (e.g., when given 25.0, the trials whose best intermediate values are in the top 25% are kept).

  • n_startup_trials – Pruning is disabled until the given number of trials finish in the same study.

  • n_warmup_steps – Pruning is disabled until the trial exceeds the given number of steps.

  • interval_steps – Interval in number of steps between the pruning checks, offset by the warmup steps. If no value has been reported at the time of a pruning check, that particular check will be postponed until a value is reported. Value must be at least 1.

class optuna.pruners.SuccessiveHalvingPruner(min_resource='auto', reduction_factor=4, min_early_stopping_rate=0)[source]

Pruner using the Asynchronous Successive Halving Algorithm.

Successive Halving is a bandit-based algorithm to identify the best configuration among multiple candidates. This class implements an asynchronous version of Successive Halving. Please refer to the paper of Asynchronous Successive Halving for detailed descriptions.

Note that this class does not take care of the parameter for the maximum resource, referred to as \(R\) in the paper. The maximum resource allocated to a trial is typically limited inside the objective function (e.g., step number in simple.py, EPOCH number in chainer_integration.py).

Example

We optimize an objective function with SuccessiveHalvingPruner.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)

def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)
    n_train_iter = 100

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)

study = optuna.create_study(direction='maximize',
                            pruner=optuna.pruners.SuccessiveHalvingPruner())
study.optimize(objective, n_trials=20)
Parameters
  • min_resource

    A parameter for specifying the minimum resource allocated to a trial (in the paper this parameter is referred to as \(r\)). This parameter defaults to ‘auto’ where the value is determined based on a heuristic that looks at the number of required steps for the first trial to complete.

    A trial is never pruned until it executes \(\mathsf{min}\_\mathsf{resource} \times \mathsf{reduction}\_\mathsf{factor}^{ \mathsf{min}\_\mathsf{early}\_\mathsf{stopping}\_\mathsf{rate}}\) steps (i.e., the completion point of the first rung). When the trial completes the first rung, it is promoted to the next rung only if its value is placed in the top \({1 \over \mathsf{reduction}\_\mathsf{factor}}\) fraction of all trials that have already reached that point (otherwise it is pruned there). If the trial wins the competition, it runs until the next completion point (i.e., \(\mathsf{min}\_\mathsf{resource} \times \mathsf{reduction}\_\mathsf{factor}^{ (\mathsf{min}\_\mathsf{early}\_\mathsf{stopping}\_\mathsf{rate} + \mathsf{rung})}\) steps) and repeats the same procedure. A worked example of this rung schedule is shown after this parameter list.

    Note

    If the step of the last intermediate value may change with each trial, please manually specify the minimum possible step to min_resource.

  • reduction_factor

    A parameter for specifying the reduction factor of promotable trials (in the paper this parameter is referred to as \(\eta\)). At the completion point of each rung, about \({1 \over \mathsf{reduction}\_\mathsf{factor}}\) of the trials will be promoted.

  • min_early_stopping_rate

    A parameter for specifying the minimum early-stopping rate (in the paper this parameter is referred to as \(s\)).
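
As a worked illustration of the rung schedule described under min_resource, the completion points can be computed directly. The values below are illustrative: min_resource is set to 1 instead of the 'auto' default, together with the default reduction_factor=4 and min_early_stopping_rate=0.

# Rung k completes after
# min_resource * reduction_factor ** (min_early_stopping_rate + k) steps.
min_resource, reduction_factor, min_early_stopping_rate = 1, 4, 0
completion_steps = [
    min_resource * reduction_factor ** (min_early_stopping_rate + rung)
    for rung in range(4)
]
print(completion_steps)  # [1, 4, 16, 64]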

class optuna.pruners.HyperbandPruner(min_resource: int = 1, max_resource: Union[str, int] = 'auto', reduction_factor: int = 3)[source]

Pruner using Hyperband.

SuccessiveHalving (SHA) requires the number of configurations \(n\) as its hyperparameter. For a given finite budget \(B\), each configuration receives \(B \over n\) resources on average. As you can see, there is a trade-off between \(B\) and \(B \over n\). Hyperband attacks this trade-off by trying different \(n\) values for a fixed budget.

Note

If you use HyperbandPruner with TPESampler, it is recommended to set a larger n_trials or timeout to make full use of the characteristics of TPESampler, because TPESampler uses some (by default, \(10\)) Trials for its startup.

As Hyperband runs multiple SuccessiveHalvingPruners and collects trials based on the current Trial’s bracket ID, each bracket needs to observe more than \(10\) Trials for TPESampler to adapt its search space.

Thus, for example, if HyperbandPruner has \(4\) pruners in it, at least \(4 \times 10\) trials are consumed for startup.
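
A configuration sketch of this recommendation follows; the trial budget of 100 is illustrative, and the objective function is assumed to be one like in the Example below.

# Illustrative: with several brackets, each needing roughly 10 trials for
# TPESampler's startup, budget n_trials well above (number of brackets) * 10.
study = optuna.create_study(
    direction='maximize',
    sampler=optuna.samplers.TPESampler(),
    pruner=optuna.pruners.HyperbandPruner(),
)
study.optimize(objective, n_trials=100)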

Note

Hyperband has several SuccessiveHalvingPruners. Each SuccessiveHalvingPruner is referred to as a “bracket” in the original paper. The number of brackets is an important factor in controlling the early stopping behavior of Hyperband and is automatically determined by min_resource, max_resource and reduction_factor as \(\mathrm{number\ of\ brackets} = \lfloor \log_{\mathsf{reduction\_factor}}(\mathsf{max\_resource} / \mathsf{min\_resource}) \rfloor + 1\). Please set reduction_factor so that the number of brackets is not too large (about 4 to 6 in most use cases). Please see Section 3.6 of the original paper for details.
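
For instance, with the values used in the Example below (min_resource=1, max_resource=100, reduction_factor=3), the formula above gives five brackets:

import math

# floor(log_3(100 / 1)) + 1 = 4 + 1 = 5 brackets.
n_brackets = math.floor(math.log(100 / 1, 3)) + 1
print(n_brackets)  # 5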

Example

We optimize an objective function with the Hyperband pruning algorithm.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

import optuna

X, y = load_iris(return_X_y=True)
X_train, X_valid, y_train, y_valid = train_test_split(X, y)
classes = np.unique(y)
n_train_iter = 100

def objective(trial):
    alpha = trial.suggest_uniform('alpha', 0.0, 1.0)
    clf = SGDClassifier(alpha=alpha)

    for step in range(n_train_iter):
        clf.partial_fit(X_train, y_train, classes=classes)

        intermediate_value = clf.score(X_valid, y_valid)
        trial.report(intermediate_value, step)

        if trial.should_prune():
            raise optuna.TrialPruned()

    return clf.score(X_valid, y_valid)

study = optuna.create_study(
    direction='maximize',
    pruner=optuna.pruners.HyperbandPruner(
        min_resource=1,
        max_resource=n_train_iter,
        reduction_factor=3
    )
)
study.optimize(objective, n_trials=20)
Parameters
  • min_resource – A parameter for specifying the minimum resource allocated to a trial, noted as \(r\) in the paper. A smaller \(r\) gives a result faster, but a larger \(r\) gives a better guarantee of successful judging between configurations. See SuccessiveHalvingPruner for details.

  • max_resource

    A parameter for specifying the maximum resource allocated to a trial. \(R\) in the paper corresponds to max_resource / min_resource. This value should match the maximum number of iteration steps (e.g., the number of epochs for neural networks). When this argument is “auto”, the maximum resource is estimated according to the completed trials. The default value of this argument is “auto”.

    Note

    With “auto”, the maximum resource is set to the largest step reported by report() in the first completed trial (or one of the first, when trials are run in parallel). No trials are pruned until the maximum resource is determined.

    Note

    If the step of the last intermediate value may change with each trial, please manually specify the maximum possible step to max_resource.

  • reduction_factor – A parameter for specifying the reduction factor of promotable trials, noted as \(\eta\) in the paper. See SuccessiveHalvingPruner for details.

Note

Added in v1.1.0 as an experimental feature. The interface may change in newer versions without prior notice. See https://github.com/optuna/optuna/releases/tag/v1.1.0.

class optuna.pruners.ThresholdPruner(lower: Optional[float] = None, upper: Optional[float] = None, n_warmup_steps: int = 0, interval_steps: int = 1)[source]

Pruner to detect outlying metrics of the trials.

Prune if a metric exceeds the upper threshold, falls below the lower threshold, or becomes NaN.

Example

from optuna import create_study
from optuna.pruners import ThresholdPruner
from optuna import TrialPruned

def objective_for_upper(trial):
    for step, y in enumerate(ys_for_upper):
        trial.report(y, step)

        if trial.should_prune():
            raise TrialPruned()
    return ys_for_upper[-1]


def objective_for_lower(trial):
    for step, y in enumerate(ys_for_lower):
        trial.report(y, step)

        if trial.should_prune():
            raise TrialPruned()
    return ys_for_lower[-1]


ys_for_upper = [0.0, 0.1, 0.2, 0.5, 1.2]
ys_for_lower = [100.0, 90.0, 0.1, 0.0, -1]
n_trial_step = 5

study = create_study(pruner=ThresholdPruner(upper=1.0))
study.optimize(objective_for_upper, n_trials=10)

study = create_study(pruner=ThresholdPruner(lower=0.0))
study.optimize(objective_for_lower, n_trials=10)
Parameters
  • lower – A minimum value which determines whether the pruner prunes or not. If an intermediate value is smaller than lower, the trial is pruned.

  • upper – A maximum value which determines whether the pruner prunes or not. If an intermediate value is larger than upper, the trial is pruned.

  • n_warmup_steps – Pruning is disabled until the trial exceeds the given number of steps.

  • interval_steps – Interval in number of steps between the pruning checks, offset by the warmup steps. If no value has been reported at the time of a pruning check, that particular check will be postponed until a value is reported. Value must be at least 1.
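
As noted above, an intermediate value of NaN also triggers pruning. The following is a minimal sketch of that behavior, reusing the imports from the example above; the objective function and reported values are hypothetical.

def objective_for_nan(trial):
    # Reporting NaN as an intermediate value prunes the trial regardless of
    # the lower/upper bounds, as described above.
    for step, y in enumerate([1.0, float('nan'), 2.0]):
        trial.report(y, step)

        if trial.should_prune():
            raise TrialPruned()
    return 0.0

study = create_study(pruner=ThresholdPruner(upper=1e9))
study.optimize(objective_for_nan, n_trials=1)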