Introduction to the Keras Tuner

Overview

The Keras Tuner is a library that helps you pick the optimal set of hyperparameters for your TensorFlow program. The process of selecting the right set of hyperparameters for your machine learning (ML) application is called hyperparameter tuning or hypertuning.

Hyperparameters are the variables that govern the training process and the topology of an ML model. These variables remain constant over the training process and directly impact the performance of your ML program. Hyperparameters are of two types:

  1. Model hyperparameters, which influence model selection, such as the number and width of hidden layers
  2. Algorithm hyperparameters, which influence the speed and quality of the learning algorithm, such as the learning rate for stochastic gradient descent (SGD) and the number of nearest neighbors for a k-nearest neighbors (KNN) classifier

In this tutorial, you will use the Keras Tuner to perform hypertuning for an image classification application.

Setup

import tensorflow as tf
from tensorflow import keras
import keras_tuner as kt

Download and prepare the dataset

In this tutorial, you will use the Keras Tuner to find the best hyperparameters for a machine learning model that classifies images of clothing from the Fashion MNIST dataset.

Load the data.

(img_train, label_train), (img_test, label_test) = keras.datasets.fashion_mnist.load_data()
# Normalize pixel values between 0 and 1
img_train = img_train.astype('float32') / 255.0
img_test = img_test.astype('float32') / 255.0

Define the model

When you build a model for hypertuning, you also define the hyperparameter search space in addition to the model architecture. The model you set up for hypertuning is called a hypermodel.

You can define a hypermodel through two approaches:

  • By using a model builder function
  • By subclassing the HyperModel class of the Keras Tuner API

You can also use two predefined HyperModel classes, HyperXception and HyperResNet, for computer vision applications.

In this tutorial, you use a model builder function to define the image classification model. The model builder function returns a compiled model and uses hyperparameters you define inline to hypertune the model.

def model_builder(hp):
  model = keras.Sequential()
  model.add(keras.layers.Flatten(input_shape=(28, 28)))

  # Tune the number of units in the first Dense layer
  # Choose an optimal value between 32-512
  hp_units = hp.Int('units', min_value=32, max_value=512, step=32)
  model.add(keras.layers.Dense(units=hp_units, activation='relu'))
  model.add(keras.layers.Dense(10))

  # Tune the learning rate for the optimizer
  # Choose an optimal value from 0.01, 0.001, or 0.0001
  hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

  model.compile(optimizer=keras.optimizers.Adam(learning_rate=hp_learning_rate),
                loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])

  return model
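
To get a feel for the size of the search space defined by the model builder above, the candidate values can be enumerated in plain Python. This is only an illustrative sketch that does not use the Keras Tuner; it assumes the `units` range includes both endpoints (32 and 512), as the comment in the builder suggests, and the names `units_values`, `learning_rates`, and `search_space` are made up for this example.

```python
from itertools import product

# Candidate values for the 'units' hyperparameter: 32, 64, ..., 512.
# Assumption: both endpoints are included, matching "between 32-512".
units_values = list(range(32, 512 + 1, 32))

# Candidate values for the 'learning_rate' hyperparameter.
learning_rates = [1e-2, 1e-3, 1e-4]

# Every (units, learning_rate) pair the tuner could in principle try.
search_space = list(product(units_values, learning_rates))

print("units candidates:", len(units_values))
print("learning-rate candidates:", len(learning_rates))
print("total combinations:", len(search_space))
```

Even this small model has dozens of possible configurations, which is why a tuner that stops unpromising trials early can save a lot of training time.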

Instantiate the tuner and perform hypertuning

Instantiate the tuner to perform the hypertuning. The Keras Tuner has four tuners available: RandomSearch, Hyperband, BayesianOptimization, and Sklearn. In this tutorial, you use the Hyperband tuner.

To instantiate the Hyperband tuner, you must specify the hypermodel, the objective to optimize, and the maximum number of epochs to train (max_epochs).

tuner = kt.Hyperband(model_builder,
                     objective='val_accuracy',
                     max_epochs=10,
                     factor=3,
                     directory='my_dir',
                     project_name='intro_to_kt')

The Hyperband tuning algorithm uses adaptive resource allocation and early stopping to quickly converge on a high-performing model. This is done using a sports-championship-style bracket: the algorithm trains a large number of models for a few epochs and carries forward only the top-performing models to the next round (roughly the top 1/factor of them, so about a third with factor=3 and half with factor=2). Hyperband determines the number of models to train in a bracket by computing 1 + log_factor(max_epochs), where the logarithm is taken to base factor, and rounding it up to the nearest integer.
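
The bracket idea can be made concrete with a small plain-Python sketch. It is not the Keras Tuner's internal implementation: `bracket_term` simply evaluates the expression quoted above with the logarithm taken to base factor, and `successive_reduction` assumes that each round keeps about 1/factor of the remaining models, as in standard successive halving. The function names and the validation scores are hypothetical.

```python
import math


def bracket_term(max_epochs, factor):
    """Evaluate 1 + log_factor(max_epochs), rounded up to an integer."""
    return math.ceil(1 + math.log(max_epochs, factor))


def successive_reduction(scores, factor):
    """Return the list of surviving configurations after each round.

    `scores` maps a hypothetical configuration name to a validation score.
    Scores are held fixed between rounds purely for illustration; a real
    tuner would train the survivors for more epochs and re-score them.
    """
    if factor < 2:
        raise ValueError("factor must be at least 2")
    survivors = sorted(scores, key=scores.get, reverse=True)
    rounds = [survivors]
    while len(survivors) > 1:
        keep = max(1, len(survivors) // factor)  # keep roughly the top 1/factor
        survivors = survivors[:keep]
        rounds.append(survivors)
    return rounds


# Values used in this tutorial: max_epochs=10, factor=3.
print("bracket term:", bracket_term(max_epochs=10, factor=3))

# Nine made-up configurations with made-up validation accuracies.
demo_values = [0.71, 0.80, 0.65, 0.90, 0.77, 0.84, 0.60, 0.88, 0.73]
demo_scores = {f"config_{i}": s for i, s in enumerate(demo_values)}

demo_rounds = successive_reduction(demo_scores, factor=3)
for round_number, names in enumerate(demo_rounds, start=1):
    print(f"round {round_number}: {len(names)} model(s) -> {names}")
```

With nine starting configurations and factor=3, the field shrinks from nine models to three and then to one, which is the tournament-style elimination the paragraph describes.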

Create a callback that stops training early when the validation loss has not improved for a set number of consecutive epochs (here, patience=5).

stop_early = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=5)
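
The patience rule behind the callback above can be sketched in plain Python. This is a simplified model of the behavior, not the Keras implementation: it ignores options such as a minimum improvement threshold or restoring the best weights, and the function name `stopping_epoch` and the loss values are made up for illustration.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None  # the run finished before patience was exhausted


# Made-up validation losses: the best value (0.35) occurs at epoch 3.
demo_losses = [0.50, 0.40, 0.35, 0.36, 0.37, 0.36, 0.38, 0.39, 0.30]

print("patience=5 stops at epoch:", stopping_epoch(demo_losses, patience=5))
print("patience=2 stops at epoch:", stopping_epoch(demo_losses, patience=2))
```

Note that in the made-up series a later epoch would have improved on the best loss, but training stops before reaching it; a larger patience trades extra training time for a lower chance of stopping too soon.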

Run the hyperparameter search. The arguments for the search method are the same as those used for tf.keras.Model.fit, in addition to the callback above.

tuner.search(img_train, label_train, epochs=50, validation_split=0.2, callbacks=[stop_early])

# Get the optimal hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

print(f"""
The hyperparameter search is complete. The optimal number of units in the first densely-connected
layer is {best_hps.get('units')} and the optimal learning rate for the optimizer
is {best_hps.get('learning_rate')}.
""")

Train the model

Find the optimal number of epochs to train the model with the hyperparameters obtained from the search.

# Build the model with the optimal hyperparameters and train it on the data for 50 epochs
model = tuner.hypermodel.build(best_hps)
history = model.fit(img_train, label_train, epochs=50, validation_split=0.2)

val_acc_per_epoch = history.history['val_accuracy']
best_epoch = val_acc_per_epoch.index(max(val_acc_per_epoch)) + 1
print('Best epoch: %d' % (best_epoch,))
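
The best-epoch calculation above can be checked on a small hand-made list. The accuracy values below are invented for illustration, and `best_epoch_from` is a hypothetical helper that wraps the same expression used in the tutorial code.

```python
def best_epoch_from(val_accuracies):
    """Return the 1-based epoch with the highest validation accuracy.

    list.index returns the first match, so ties resolve to the earliest
    epoch, which also means the shortest retraining run.
    """
    return val_accuracies.index(max(val_accuracies)) + 1


# Made-up per-epoch validation accuracies; the maximum appears twice.
demo_val_acc = [0.85, 0.88, 0.90, 0.89, 0.90]

print("Best epoch:", best_epoch_from(demo_val_acc))
```

The `+ 1` converts the zero-based list index into an epoch count that can be passed directly as the `epochs` argument when retraining.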

Re-instantiate the hypermodel and train it with the optimal number of epochs from above.

hypermodel = tuner.hypermodel.build(best_hps)

# Retrain the model
hypermodel.fit(img_train, label_train, epochs=best_epoch, validation_split=0.2)

To finish this tutorial, evaluate the hypermodel on the test data.

eval_result = hypermodel.evaluate(img_test, label_test)
print("[test loss, test accuracy]:", eval_result)

The my_dir/intro_to_kt directory contains detailed logs and checkpoints for every trial (model configuration) run during the hyperparameter search. If you re-run the hyperparameter search, the Keras Tuner uses the existing state from these logs to resume the search. To disable this behavior, pass an additional overwrite=True argument while instantiating the tuner.

Code: https://codechina.csdn.net/csdn_codechina/enterprise_technology/-/blob/master/Hypertuner/Introduction%20to%20the%20Keras%20Tuner.ipynb
