* Note: For smaller datasets (10,000 or fewer data points), fitting with early stopping may sacrifice some accuracy. We don’t anticipate this will matter for most users, as the library is intended to speed up large training tasks with large datasets.
Let’s take a look at how it all works.
Run pip install tune-sklearn ray[tune] or pip install tune-sklearn "ray[tune]" to get started with our example code below.
To start out, it’s as easy as changing our import statement to get Tune’s grid search cross validation interface:
And from there, we would proceed just as we would in Scikit-Learn’s interface! Let’s use a “dummy” custom classification dataset and an SGDClassifier to classify the data.
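A sketch of that setup — the dataset sizes and grid values here are illustrative, not prescriptive:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

# A synthetic ("dummy") multi-class classification dataset
X, y = make_classification(
    n_samples=11000, n_features=1000, n_informative=50,
    n_redundant=0, n_classes=10, class_sep=2.5,
)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1000)

# Hyperparameter grid to search over (example values)
parameter_grid = {"alpha": [1e-4, 1e-1, 1], "epsilon": [0.01, 0.1]}
```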
We choose the SGDClassifier because it has a partial_fit API, which enables it to stop fitting to the data for a certain hyperparameter configuration. If the estimator does not support early stopping, we fall back to a parallel grid search.
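To see why partial_fit matters, here is a minimal plain-scikit-learn sketch: each call runs one pass over the data, so a scheduler can inspect the score between calls and abandon an unpromising configuration early. (The epoch count and dataset here are arbitrary, for illustration only.)

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

clf = SGDClassifier(random_state=42)
classes = np.unique(y)  # partial_fit needs the full label set up front
for epoch in range(5):
    # One pass over the data; a scheduler could compare scores here
    # and stop this hyperparameter configuration early.
    clf.partial_fit(X, y, classes=classes)
    print(epoch, clf.score(X, y))
```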
As you can see, the setup here is exactly how you would do it for Scikit-Learn! Now, let’s try fitting a model.
Note the slight differences we introduced above:
- a new early_stopping parameter, and
- a specification of max_iters.

early_stopping determines when to stop early — MedianStoppingRule is a great default, but see Tune’s documentation on schedulers here for a full list to choose from. max_iters is the maximum number of iterations a given hyperparameter set could run for; it may run for fewer iterations if it is early stopped.
Try running this compared to the GridSearchCV equivalent.
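For comparison, the standard scikit-learn version of the same search (same illustrative dataset and grid) looks like this:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV

X_train, y_train = make_classification(
    n_samples=2000, n_features=50, n_informative=10, random_state=0
)
parameter_grid = {"alpha": [1e-4, 1e-1, 1], "epsilon": [0.01, 0.1]}

# No early stopping: every configuration runs to completion
sklearn_search = GridSearchCV(SGDClassifier(), parameter_grid, n_jobs=-1)
sklearn_search.fit(X_train, y_train)
print(sklearn_search.best_params_)
```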
Other than the grid search interface, tune-sklearn also provides an interface, TuneSearchCV, for sampling from distributions of hyperparameters.
In addition, you can easily enable Bayesian optimization over the distributions in TuneSearchCV in only a few lines of code changes.
Run pip install scikit-optimize to try out this example:
As you can see, it’s very simple to integrate tune-sklearn into existing code. Check out more detailed examples and get started with tune-sklearn here and let us know what you think! Also take a look at Ray’s replacement for joblib, which allows users to parallelize training over multiple nodes, not just one node, further speeding up training.
*Note: importing from ray.tune as shown in the linked documentation is available only on the nightly Ray wheels and will be available on pip soon.