
an example with Keras and TensorFlow 2.0
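
The rest of this walkthrough assumes the Iris data has already been loaded into a pandas DataFrame called df, with the four feature columns plus one-hot encoded label columns (label_setosa, label_versicolor, label_virginica). If you are jumping in here, a minimal sketch of how such a df could be built (assuming sklearn's load_iris and pandas' get_dummies; not necessarily how the original notebook does it):

import numpy as np
import pandas as pd
from sklearn.datasets import load_iris

# Load the Iris features into a DataFrame
iris = load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)

# One-hot encode the target into label_setosa / label_versicolor / label_virginica
labels = pd.get_dummies(pd.Series(iris.target_names[iris.target]), prefix='label')
df = pd.concat([df, labels], axis=1)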

Next, let’s create X and y. Keras and TensorFlow 2.0 only take in NumPy arrays as inputs, so we will have to convert the DataFrames back to NumPy arrays.

# Creating X and y
X = df[['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']]
# Convert DataFrame into np array
X = np.asarray(X)
y = df[['label_setosa', 'label_versicolor', 'label_virginica']]
# Convert DataFrame into np array
y = np.asarray(y)

Finally, let’s split the dataset into a training set (80%) and a test set (20%) using train_test_split() from the sklearn library.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.20
)

Great! Our data is ready for building a machine learning model.

Before applying regularization, let’s build a neural network without regularization and take a look at the overfitting issue.

There are 3 ways to create a machine learning model with Keras and TensorFlow 2.0. Since we are building a simple fully connected neural network, for simplicity let’s use the easiest way: a Sequential model built with Sequential().
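
The other two approaches are the functional API and model subclassing. For comparison only, here is a rough sketch of a small network written with the functional API (it is not used in the rest of this tutorial):

from tensorflow.keras import Input, Model
from tensorflow.keras.layers import Dense

# Functional API: wire layers explicitly from an Input tensor to the output tensor
inputs = Input(shape=(4,))
x = Dense(64, activation='relu')(inputs)
outputs = Dense(3, activation='softmax')(x)
functional_model = Model(inputs=inputs, outputs=outputs)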

Let’s go ahead and create a function called create_model() to return a Sequential model.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def create_model():
    model = Sequential([
        Dense(64, activation='relu', input_shape=(4,)),
        Dense(128, activation='relu'),
        Dense(128, activation='relu'),
        Dense(128, activation='relu'),
        Dense(64, activation='relu'),
        Dense(64, activation='relu'),
        Dense(64, activation='relu'),
        Dense(3, activation='softmax')
    ])
    return model

Notice that:

  • The first Dense layer (also acting as the input layer) sets the expected input size with input_shape=(4,)
  • That first layer has 64 units, followed by 3 dense layers with 128 units each, and then 3 further dense layers with 64 units each. All of these layers use the ReLU activation function.
  • The output Dense layer has 3 units and the softmax activation function.

Running

model = create_model()
model.summary()

should print out the model summary, listing the eight Dense layers and their parameter counts (58,435 trainable parameters in total for this architecture).

1.1 Training a model

In order to train a model, we first have to configure our model using compile() and pass the following arguments:

  • Use the Adam (adam) optimization algorithm as the optimizer
  • Use the categorical cross-entropy loss function (categorical_crossentropy) for our multi-class classification problem
  • For simplicity, use accuracy as our evaluation metric during training and testing

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

After that, we can call model.fit() to fit our model to the training data.

history = model.fit(
    X_train,
    y_train,
    epochs=200,
    validation_split=0.25,
    batch_size=40,
    verbose=2
)

If everything runs smoothly, we should get an output like the one below.

Train on 90 samples, validate on 30 samples
Epoch 1/200
90/90 - 1s - loss: 1.0939 - accuracy: 0.4333 - val_loss: 1.0675 - val_accuracy: 0.5333
Epoch 2/200
90/90 - 0s - loss: 1.0553 - accuracy: 0.6556 - val_loss: 1.0160 - val_accuracy: 0.7000
......
......
Epoch 200/200
90/90 - 0s - loss: 0.0624 - accuracy: 0.9778 - val_loss: 0.1874 - val_accuracy: 0.9333

1.2 Model Evaluation

Once training is complete, it’s time to see whether the model is any good through model evaluation. Model evaluation typically involves:

  1. Plotting the progress of the loss and accuracy metrics
  2. Testing our model against data that has never been used for training. This is where the test dataset X_test that we set aside earlier comes into play.

Let’s create a function plot_metric() for plotting metrics.

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

import matplotlib.pyplot as plt

def plot_metric(history, metric):
    train_metrics = history.history[metric]
    val_metrics = history.history['val_' + metric]
    epochs = range(1, len(train_metrics) + 1)
    plt.plot(epochs, train_metrics)
    plt.plot(epochs, val_metrics)
    plt.title('Training and validation ' + metric)
    plt.xlabel("Epochs")
    plt.ylabel(metric)
    plt.legend(["train_" + metric, 'val_' + metric])
    plt.show()

Run plot_metric(history, 'accuracy') to plot the progress on accuracy.

Run plot_metric(history, 'loss') to plot the progress on loss.

From the above graphs, we can see that the model has overfit the training data, so its performance on the training set is noticeably better than on the validation set.

To evaluate the model on the test set

# Evaluate the model on the test set
model.evaluate(X_test, y_test, verbose=2)

And we should get an output like below

30/1 - 0s - loss: 0.0137 - accuracy: 1.0000
[0.01365612167865038, 1.0]

To add regularization, let’s first import Dropout and the L2 regularizer from the TensorFlow Keras package.

from tensorflow.keras.layers import Dropout
from tensorflow.keras.regularizers import l2
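
As a quick aside, l2(factor) returns a regularizer that computes factor times the sum of squared weights, and Keras adds that penalty to the training loss for every layer it is attached to. A minimal sketch of what it computes, using an illustrative factor of 0.01:

import tensorflow as tf

# A regularizer object is callable: it maps a weight tensor to a scalar penalty
reg = l2(0.01)
weights = tf.ones((2, 2))   # four weights, each equal to 1.0
penalty = reg(weights)      # 0.01 * sum(w**2) = 0.01 * 4 = 0.04
print(float(penalty))       # ~0.04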

Then, we create a function called create_regularized_model(), which will return a model similar to the one we built before, but this time with L2 regularization and Dropout layers added. The function therefore takes 2 arguments: an L2 regularization factor and a Dropout rate.

  • Let’s add L2 regularization to all layers except the output layer [1].
  • Let’s add a Dropout layer between every two dense layers.

def create_regularized_model(factor, rate):
    model = Sequential([
        Dense(64, kernel_regularizer=l2(factor), activation="relu", input_shape=(4,)),
        Dropout(rate),
        Dense(128, kernel_regularizer=l2(factor), activation="relu"),
        Dropout(rate),
        Dense(128, kernel_regularizer=l2(factor), activation="relu"),
        Dropout(rate),
        Dense(128, kernel_regularizer=l2(factor), activation="relu"),
        Dropout(rate),
        Dense(64, kernel_regularizer=l2(factor), activation="relu"),
        Dropout(rate),
        Dense(64, kernel_regularizer=l2(factor), activation="relu"),
        Dropout(rate),
        Dense(64, kernel_regularizer=l2(factor), activation="relu"),
        Dropout(rate),
        Dense(3, activation='softmax')
    ])
    return model

Let’s create the model with an L2 factor of 1e-5 and a Dropout rate of 0.3.

model = create_regularized_model(1e-5, 0.3)
model.summary()

2.1 Training

The regularized model can be trained just like the first model we built.

# First, configure the model using model.compile()
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Then, train the model with fit()
history = model.fit(
    X_train,
    y_train,
    epochs=200,
    validation_split=0.25,
    batch_size=40,
    verbose=2
)

If everything runs smoothly, we should get an output like the one below.

Train on 90 samples, validate on 30 samples
Epoch 1/200
90/90 - 2s - loss: 1.0855 - accuracy: 0.3333 - val_loss: 1.0873 - val_accuracy: 0.3000
Epoch 2/200
90/90 - 0s - loss: 1.0499 - accuracy: 0.3778 - val_loss: 1.0773 - val_accuracy: 0.3000
......
......
Epoch 200/200
90/90 - 0s - loss: 0.1073 - accuracy: 0.9556 - val_loss: 0.1766 - val_accuracy: 0.9000

2.2 Model Evaluation

Now, let’s plot the progress on loss

plot_metric(history, 'loss')

From the graph, we can see that the overfitting is not completely fixed, but there is a significant improvement compared to the unregularized model.
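
We can likewise compare the accuracy curves for the regularized model with the same helper:

# Plot training vs. validation accuracy for the regularized model
plot_metric(history, 'accuracy')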

And finally, to evaluate the model on the test set

model.evaluate(X_test, y_test, verbose=2)

Should output something like

30/1 - 0s - loss: 0.0602 - accuracy: 0.9667
[0.06016349419951439, 0.96666664]

Thanks for reading.

Please check out the notebook on my GitHub for the source code.

Stay tuned if you are interested in the practical aspect of machine learning.