<font size="5"><b>Lecture 12: Deep Learning DL2 (exercises)</b></font>

**HOW TO RUN THIS NOTEBOOK**:<br>

1.   <u>if TF installed locally</u><br>
⇒ activate the conda environment in which you installed TensorFlow (e.g., "tf"), and launch Jupyter and this notebook from it:<br>
```
$ conda activate tf
$ jupyter notebook
```
2.   <u>if using Google Colab</u><br>
⇒ simply open this notebook<br>
NB: the TensorBoard notebook extension can be loaded from Google Colab

**NOTA BENE**: some exercises are inspired by Aurélien Géron's book _"Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow"_

# Initialize

In [None]:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

# CNN from scratch (MNIST fashion dataset)

<div class="alert alert-warning">
    In this exercise, you will design and train a <b>Convolutional Neural Network (CNN)</b> to classify the MNIST-fashion dataset, just as you did with the <b>MLP</b> model.<br>
</div>

## load data

<div class="alert alert-success">
    Load the MNIST fashion dataset.<br>
    Split the full training dataset (images and labels) to create a validation dataset of 5000 instances.
</div>
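
One possible way to carve out the validation set is plain array slicing. The sketch below uses zero-filled stand-in arrays with the shape of the Fashion MNIST training set (60,000 images of 28x28 pixels) instead of the real data. The names `X_train_full` and `y_train_full` are assumptions, and holding out the first 5000 instances (rather than a random subset) is only one reasonable choice.

```python
import numpy as np

# Stand-in arrays with the shape of the Fashion MNIST training set.
# With TensorFlow available, the real arrays would come from
# tf.keras.datasets.fashion_mnist.load_data().
X_train_full = np.zeros((60000, 28, 28), dtype=np.uint8)
y_train_full = np.zeros(60000, dtype=np.uint8)

n_valid = 5000  # validation set size requested in the exercise

# Hold out the first n_valid instances for validation, keep the rest for training.
X_valid, X_train = X_train_full[:n_valid], X_train_full[n_valid:]
y_valid, y_train = y_train_full[:n_valid], y_train_full[n_valid:]

print(X_train.shape, X_valid.shape)
```

Images and labels are sliced with the same indices, so each image keeps its label.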

## preprocess data

<div class="alert alert-success">
    Preprocess the datasets:<br>
    1. compute the mean and standard deviation of the training dataset<br>
    2. scale the datasets (training, validation and test): on each dataset, subtract the mean value, and divide by the standard deviation<br>
    3. add a dimension to each dataset array (so keras will know it is dealing with a single color channel image, i.e. grayscale)
</div>
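
The three steps above can be sketched with NumPy as follows. This is a minimal illustration on small random stand-in arrays, not the real images, and it assumes a single scalar mean and standard deviation computed over all training pixels; the variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Small random stand-ins for the real image arrays (pixel values 0..255).
X_train = rng.integers(0, 256, size=(1000, 28, 28)).astype(np.float32)
X_valid = rng.integers(0, 256, size=(200, 28, 28)).astype(np.float32)
X_test = rng.integers(0, 256, size=(200, 28, 28)).astype(np.float32)

# 1. statistics are computed on the training set only
mean = X_train.mean()
std = X_train.std()

# 2. the same shift and scale are applied to every dataset
X_train_s = (X_train - mean) / std
X_valid_s = (X_valid - mean) / std
X_test_s = (X_test - mean) / std

# 3. add a trailing channel axis: (n, 28, 28) -> (n, 28, 28, 1)
X_train_s = X_train_s[..., np.newaxis]
X_valid_s = X_valid_s[..., np.newaxis]
X_test_s = X_test_s[..., np.newaxis]

print(X_train_s.shape, X_valid_s.shape, X_test_s.shape)
```

Reusing the training statistics on the validation and test sets keeps all three datasets on the same scale and avoids leaking information from the held-out data.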

## build model (using the Sequential API)

<div class="alert alert-success">
    Build a CNN with the following architecture.<br>
    Carefully look at the model. Explain the overall architecture (can you identify blocks?), and explain what each layer does.<br>
</div>

In [None]:
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, 7, activation="relu", padding="same", input_shape=[28, 28, 1]),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    tf.keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    tf.keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax")
])
model.summary()
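
To check your reading of the architecture against `model.summary()`, the plain-Python sketch below traces the feature-map shape and the number of parameters through the stack. The helper names (`conv`, `pool`, `dense`) are hypothetical; the sketch only assumes stride-1 convolutions with "same" padding and non-overlapping 2x2 max pooling, as in the model above.

```python
# Trace (height, width, channels) and parameter counts through the CNN.
# "same" padding with stride 1 keeps height and width unchanged;
# MaxPooling2D(2) halves them (rounding down). Dropout and Flatten add no parameters.

def conv(shape, filters, kernel):
    h, w, c = shape
    params = kernel * kernel * c * filters + filters  # weights + biases
    return (h, w, filters), params

def pool(shape, size=2):
    h, w, c = shape
    return (h // size, w // size, c), 0

def dense(n_inputs, units):
    return units, n_inputs * units + units  # weights + biases

shape = (28, 28, 1)
total = 0
plan = [("conv", 64, 7), ("pool",),
        ("conv", 128, 3), ("conv", 128, 3), ("pool",),
        ("conv", 256, 3), ("conv", 256, 3), ("pool",)]
for step in plan:
    if step[0] == "conv":
        shape, params = conv(shape, step[1], step[2])
    else:
        shape, params = pool(shape)
    total += params
    print(step[0], shape, params)

n_features = shape[0] * shape[1] * shape[2]  # Flatten
for units in (128, 64, 10):
    n_features, params = dense(n_features, units)
    total += params
    print("dense", n_features, params)

print("total parameters:", total)
```

The printout makes the block structure visible: each group of convolutions is followed by a pooling layer that shrinks the image while the number of feature maps grows.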

## compile model

<div class="alert alert-success">
    Compile the model with the following settings:<br>
    <ul>
        <li>loss="sparse_categorical_crossentropy"</li>
        <li>optimizer="nadam"</li>
        <li>metrics=["accuracy"]</li>
    </ul>
</div>
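
The "sparse" variant of the cross-entropy loss expects integer class labels (0 to 9 here) rather than one-hot vectors. The plain-Python sketch below illustrates the quantity being averaged; it is an illustration of the formula with made-up probabilities, not the Keras implementation.

```python
import math

def sparse_categorical_crossentropy(labels, probabilities):
    """Mean of -log(p[true class]) over the batch; labels are integer class ids."""
    losses = [-math.log(p[label]) for label, p in zip(labels, probabilities)]
    return sum(losses) / len(losses)

# Two hypothetical softmax outputs over 3 classes
probs = [[0.7, 0.2, 0.1],
         [0.1, 0.1, 0.8]]

good = sparse_categorical_crossentropy([0, 2], probs)  # labels match the confident class
bad = sparse_categorical_crossentropy([1, 0], probs)   # labels hit low-probability classes
print(round(good, 4), round(bad, 4))
```

The loss is small when the model puts high probability on the true class and grows quickly when it does not.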

## train model

<div class="alert alert-success">
    Train your model on the training dataset for 10 epochs, and use the validation dataset to evaluate while training. Use a batch_size of 32 images.<br>
    <br>
    Before launching the .fit() function, create a TensorBoard callback using the code below. Place it in a list, and pass it to .fit() using the "callbacks" option.<br>
</div>
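
The progress bar printed by `.fit()` counts batches, not images. As a quick sanity check, the arithmetic below shows how many weight updates to expect, assuming the 60,000-image training set with 5000 instances held out for validation.

```python
import math

n_train = 60000 - 5000   # training images left after the validation split
batch_size = 32
epochs = 10

steps_per_epoch = math.ceil(n_train / batch_size)   # the last batch is smaller
last_batch_size = n_train - (steps_per_epoch - 1) * batch_size
total_updates = steps_per_epoch * epochs

print(steps_per_epoch, last_batch_size, total_updates)
```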

<div class="alert alert-info">
    Depending on your hardware, the training might be very slow. In order to speed up the exercise, you can instead load the weights of an already trained model. Go to the last section of this exercise, and follow the steps to restore the trained model.<br>
</div>

In [None]:
import datetime
log_dir = "log/cnn_deep/" + datetime.datetime.now().strftime("%Y%m%d-%H%M%S")
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir, histogram_freq=1)

## view training progression using TensorBoard

<div class="alert alert-success">
    Use TensorBoard in order to visualize the training progression.
</div>

### if running notebook from Google Colab

In [None]:
# Load the TensorBoard notebook extension IN GOOGLE COLAB
# NB: need to wait for first epoch to complete in order to start visualizing time series!
# IMPORTANT: click refresh button once 1 epoch achieved
%load_ext tensorboard

%tensorboard --logdir log/cnn_deep

### if running notebook from local installation
Open a terminal and type:
```
$ conda activate tf
$ cd <working dir>
$ tensorboard --logdir log/cnn_deep    # set directory used to store logs
```
Open a web-browser at the address:
```
http://localhost:6006/
```

## evaluate model

<div class="alert alert-success">
    Evaluate the model on the test dataset. What is the score?<br>
</div>

## predict

<div class="alert alert-success">
    Take 10 images from the test dataset (pretending we have new images!), and predict the class. 
</div>
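
`model.predict()` returns one row of 10 class probabilities per image; the predicted class is the index of the largest one. The sketch below shows that last step on made-up probabilities (the `y_proba` values are hypothetical, not model output), using the standard Fashion MNIST class names.

```python
import numpy as np

class_names = ["T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
               "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot"]

# Hypothetical output of model.predict() for 3 images
y_proba = np.array([
    [0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.01, 0.02, 0.00, 0.90],
    [0.02, 0.00, 0.85, 0.01, 0.05, 0.00, 0.07, 0.00, 0.00, 0.00],
    [0.00, 0.99, 0.00, 0.01, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
])

y_pred = y_proba.argmax(axis=1)               # index of the most probable class per image
predicted = [class_names[i] for i in y_pred]  # map indices to readable names
print(predicted)
```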

## save/restore trained model

https://www.tensorflow.org/tutorials/keras/save_and_load

In [None]:
# --- save trained model (weights)
# Use the following line of code to save the weights once you are happy with your trained model.

model.save_weights('./checkpoints/my_checkpoint') # directory will be automatically created

In [None]:
# --- restore model
# Use the following lines of code to restore a trained model from the saved weights.

# - Create a new model instance
# NB: ideally you should wrap the model definition in a function to avoid rewriting it!
model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, 7, activation="relu", padding="same", input_shape=[28, 28, 1]),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    tf.keras.layers.Conv2D(128, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    tf.keras.layers.Conv2D(256, 3, activation="relu", padding="same"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(10, activation="softmax")
])

# - Restore the weights
model.load_weights('./checkpoints/my_checkpoint')

# - Compile
model.compile(loss="sparse_categorical_crossentropy", optimizer="nadam", metrics=["accuracy"])

# Transfer Learning: fine-tune a pre-trained CNN

<div class="alert alert-warning">
    Designing and training your own network can be difficult (or impossible if you do not have enough data).<br>
    <br>
    It has therefore become common practice to do "<u>transfer learning</u>": reuse the lower layers of a pretrained model, and fine-tune the upper layers of a model designed to achieve your task.<br>
    <br>
    In this exercise, you will train a model to classify pictures of flowers, reusing the pretrained <u>Xception model</u>. This exercise is a guided "copy-paste" rather than actual programming!
</div>

## install tensorflow-datasets (tfds)

<div class="alert alert-success">
    The TensorFlow DataSets (TFDS) project makes it very easy to download common datasets of various types (see complete list <a href="https://www.tensorflow.org/datasets/catalog/overview#all_datasets" target="_blank">here</a>).<br>
    <br>
    TFDS is not bundled with TensorFlow, so you need to install the <u>tensorflow-datasets</u> library. If it is not yet installed in your conda environment, follow the steps below:
</div>

```
$ conda activate tf
$ conda install -c anaconda tensorflow-datasets
```

## load data

<div class="alert alert-success">
    Load the dataset "tf_flowers" from the tensorflow_datasets using the code below. 
</div>

In [None]:
# Download the dataset
import tensorflow_datasets as tfds
dataset, info = tfds.load("tf_flowers", as_supervised=True, with_info=True)

In [None]:
dataset_size = info.splits["train"].num_examples
class_names = info.features["label"].names
n_classes = info.features["label"].num_classes

In [None]:
print(info.splits)
print(dataset_size)
print(class_names)

## split data

<div class="alert alert-success">
    Use the TFDS library to split the dataset into "test" (first 10%), "validate" (10-25%) and "train" (remaining 75%).
</div>

In [None]:
(test_set_raw, valid_set_raw, train_set_raw), metadata = tfds.load(
    'tf_flowers',
    # NB: the older tfds.Split.TRAIN.subsplit(tfds.percent[...]) API was removed from recent TFDS releases;
    # percent slices of the "train" split are now written as strings
    split=["train[:10%]", "train[10%:25%]", "train[25%:]"],
    with_info=True,
    as_supervised=True,
)

**Visualize the raw dataset**

In [None]:
plt.figure(figsize=(12, 10))
index = 0
for image, label in train_set_raw.take(9):
    index += 1
    plt.subplot(3, 3, index)
    plt.imshow(image)
    plt.title("Class: {}".format(class_names[label]))
    plt.axis("off")

plt.show()

## preprocess data

<div class="alert alert-success">
    The pretrained CNN we want to use is the <u>Xception</u> model. In order to use it, we need to preprocess our images as the CNN expects them:<br>
    <ul>
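        <li>Xception expects images of a fixed size: we use 224x224 here (the model's default is 299x299, but other sizes work once the top layers are excluded)</li>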
        <li>Xception expects images passed through its preprocess_input() function</li>
    </ul>
    <br>
    We can therefore define the following preprocess() function to meet these two expectations. Next, apply this preprocessing function to all three datasets, shuffle the training set, and add batching and prefetching to all the datasets.
</div>

In [None]:
def preprocess(image, label):
    resized_image = tf.image.resize(image, [224, 224])
    final_image = tf.keras.applications.xception.preprocess_input(resized_image)
    return final_image, label
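
For Xception, `preprocess_input()` rescales pixel values from [0, 255] to [-1, 1]. The NumPy sketch below reproduces that arithmetic on a stand-in array; `scale_to_unit_range` is a hypothetical name, and the function is an illustration of the scaling rather than a replacement for the Keras function.

```python
import numpy as np

def scale_to_unit_range(pixels):
    """Map pixel values from [0, 255] to [-1, 1]."""
    return pixels.astype(np.float32) / 127.5 - 1.0

# Darkest, middle and brightest pixel values
pixels = np.array([0.0, 127.5, 255.0])
scaled = scale_to_unit_range(pixels)
print(scaled)
```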

In [None]:
batch_size = 32
train_set = train_set_raw.shuffle(1000)
# map the *shuffled* dataset (mapping train_set_raw here would silently discard the shuffling)
train_set = train_set.map(preprocess).batch(batch_size).prefetch(1)
valid_set = valid_set_raw.map(preprocess).batch(batch_size).prefetch(1)
test_set = test_set_raw.map(preprocess).batch(batch_size).prefetch(1)
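
To build some intuition for what `shuffle(1000)` and `batch(32)` do, here is a simplified pure-Python model of a shuffle buffer followed by batching. It is a sketch of the idea only, not the tf.data implementation, and the helper names are hypothetical.

```python
import random

def shuffle_buffer(items, buffer_size, seed=None):
    """Yield items in shuffled order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)            # fill the buffer first
        else:
            idx = rng.randrange(buffer_size)
            yield buffer[idx]              # emit a random buffered element...
            buffer[idx] = item             # ...and replace it with the new one
    while buffer:                          # source exhausted: drain the buffer
        yield buffer.pop(rng.randrange(len(buffer)))

def batch(items, batch_size):
    """Group consecutive items into lists of at most batch_size elements."""
    current = []
    for item in items:
        current.append(item)
        if len(current) == batch_size:
            yield current
            current = []
    if current:
        yield current                      # last, possibly smaller, batch

batches = list(batch(shuffle_buffer(range(10), buffer_size=4, seed=0), batch_size=4))
print(batches)
```

Because each element is drawn from a limited buffer, a buffer smaller than the dataset only shuffles approximately: the first element emitted always comes from the first `buffer_size` items.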

In [None]:
type(train_set_raw)

## data augmentation (optional) 

<div class="alert alert-success">
    In case your dataset is very limited, you can perform some data augmentation in order to virtually increase the amount of data for training.<br>
    Below are some examples of data augmentation techniques, using image cropping and flipping.
</div>

In [None]:
# Author: Aurélien Géron

from functools import partial

def central_crop(image):
    shape = tf.shape(image)
    min_dim = tf.reduce_min([shape[0], shape[1]])
    top_crop = (shape[0] - min_dim) // 4
    bottom_crop = shape[0] - top_crop
    left_crop = (shape[1] - min_dim) // 4
    right_crop = shape[1] - left_crop
    return image[top_crop:bottom_crop, left_crop:right_crop]

def random_crop(image):
    shape = tf.shape(image)
    min_dim = tf.reduce_min([shape[0], shape[1]]) * 90 // 100
    return tf.image.random_crop(image, [min_dim, min_dim, 3])

def preprocess(image, label, randomize=False):
    if randomize:
        cropped_image = random_crop(image)
        cropped_image = tf.image.random_flip_left_right(cropped_image)
    else:
        cropped_image = central_crop(image)
    resized_image = tf.image.resize(cropped_image, [224, 224])
    final_image = tf.keras.applications.xception.preprocess_input(resized_image)
    return final_image, label

batch_size = 32
# NB: no .repeat() here: fit() is called without steps_per_epoch, so the dataset must be finite
train_set = train_set_raw.shuffle(1000)
train_set = train_set.map(partial(preprocess, randomize=True)).batch(batch_size).prefetch(1)
valid_set = valid_set_raw.map(preprocess).batch(batch_size).prefetch(1)
test_set = test_set_raw.map(preprocess).batch(batch_size).prefetch(1)
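
The index arithmetic in `central_crop()` is easier to follow on concrete numbers. The plain-Python sketch below mirrors it for a hypothetical 300x500 image; `central_crop_bounds` is an illustrative helper, not part of the notebook's pipeline.

```python
def central_crop_bounds(height, width):
    """Return the (top, bottom, left, right) slice bounds used by central_crop()."""
    min_dim = min(height, width)
    top = (height - min_dim) // 4
    bottom = height - top
    left = (width - min_dim) // 4
    right = width - left
    return top, bottom, left, right

top, bottom, left, right = central_crop_bounds(300, 500)
cropped_size = (bottom - top, right - left)
print((top, bottom, left, right), cropped_size)
```

With these bounds a quarter of the excess is trimmed from each side of the longer dimension, so the crop is closer to square than the original without necessarily being square; the subsequent resize to 224x224 does the rest.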

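NB: Keras also provides these basic data augmentation operations (e.g. the `tf.keras.layers.RandomFlip` and `tf.keras.layers.RandomCrop` preprocessing layers in recent TensorFlow versions)!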

## load pretrained CNN

<div class="alert alert-success">
    You can download the Xception model, which we will use as the base of our model.<br>
    <br>
    Use the following settings:
    <ul>
        <li>weights="imagenet": this will download the weights learned while training on the ImageNet dataset</li>
        <li>include_top=False: this will exclude the top of the network, i.e. the <i>global average pooling layer</i> and the <i>dense output layer</i> of the model</li>
    </ul>
</div>

In [None]:
# - load Xception model and define as base model
base_model = tf.keras.applications.xception.Xception(weights="imagenet", include_top=False)

## add layers

<div class="alert alert-success">
    We now need to add the layers that will perform our own classification task:<br>
    <ul>
        <li>a <i>global average pooling layer</i>, based on the output of the base model</li>
        <li>a <i>dense output layer</i> with one unit per class, using the softmax activation function</li>
    </ul>
</div>

NB: include_top=False also removes the Global Average Pooling layer (in case we did not want one); it turns out we do want it, so we add it back.

In [None]:
# - add layers for our training
avg = tf.keras.layers.GlobalAveragePooling2D()(base_model.output) # takes base_model outputs as input
output = tf.keras.layers.Dense(n_classes, activation="softmax")(avg) # takes GlobalAveragePooling2D layer as input

# - create final model
model = tf.keras.Model(inputs=base_model.input, outputs=output)
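
What the two added layers compute can be sketched in NumPy. The example below uses a random stand-in for the base model output (for 224x224 inputs, Xception without its top produces 7x7 feature maps with 2048 channels) and random, untrained dense weights; all names are hypothetical and nothing here touches the real model.

```python
import numpy as np

rng = np.random.default_rng(0)

batch, n_classes = 4, 5                           # tf_flowers has 5 classes
features = rng.normal(size=(batch, 7, 7, 2048))   # stand-in for base_model output

# Global average pooling: mean over the two spatial axes -> (batch, 2048)
avg = features.mean(axis=(1, 2))

# Dense softmax layer with random (untrained) weights
W = rng.normal(scale=0.01, size=(2048, n_classes))
b = np.zeros(n_classes)
logits = avg @ W + b
exp = np.exp(logits - logits.max(axis=1, keepdims=True))  # numerically stable softmax
proba = exp / exp.sum(axis=1, keepdims=True)

print(avg.shape, proba.shape)
```

Global average pooling reduces each feature map to a single number, so the dense layer only needs 2048 weights per class (plus a bias) instead of one weight per spatial position.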

We need to freeze the weights of the pre-trained layers at the beginning of training:

In [None]:
for layer in base_model.layers: # each layer is an object with a `trainable` attribute
    layer.trainable = False

## compile and train model

In [None]:
# - compile
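# NB: `lr=` and `decay=` are deprecated in recent Keras; this schedule gives the same lr / (1 + decay * step)
lr_schedule = tf.keras.optimizers.schedules.InverseTimeDecay(0.2, decay_steps=1, decay_rate=0.01)
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)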
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
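
The learning-rate decay used with this optimizer follows an inverse-time rule: after `step` batches the learning rate is the initial value divided by `1 + decay * step`. The short sketch below tabulates it for the settings above; `decayed_lr` is a hypothetical helper written only for illustration.

```python
def decayed_lr(initial_lr, decay, step):
    """Inverse-time decay: learning rate after `step` training batches."""
    return initial_lr / (1.0 + decay * step)

for step in (0, 100, 1000):
    print(step, round(decayed_lr(0.2, 0.01, step), 5))
```

With a decay of 0.01 the learning rate is already halved after 100 batches, which is why a fairly large initial value can be used while the base layers are frozen.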

In [None]:
# - train
history = model.fit(train_set, epochs=5, validation_data=valid_set)

## fine-tuning: unfreeze all layers and continue training
After training the model for a few epochs, the validation accuracy should reach about 75–80% and stop making much progress.<br>
<br>
This means that the top layers are now pretty well trained, so we are ready to unfreeze all the layers (or just the top ones) and continue training: we are now <u>FINE TUNING</u> the pretrained weights.<br>
<br>
Important: In order to avoid damaging what the base layers have learned on the ImageNet dataset (i.e. pretrained weights), we need to use a much lower learning rate.

In [None]:
# - unfreeze base layers
for layer in base_model.layers: #NB: could iterate on only a part of these layers
    layer.trainable = True

# - set much lower learning rate to avoid damaging the pretrained weights
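lr_schedule = tf.keras.optimizers.schedules.InverseTimeDecay(0.01, decay_steps=1, decay_rate=0.001)  # replaces the deprecated lr=/decay= arguments
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)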

In [None]:
# - recompile and continue training
model.compile(loss="sparse_categorical_crossentropy", optimizer=optimizer, metrics=["accuracy"])
history = model.fit(train_set, epochs=5, validation_data=valid_set)

# MLP (MNIST fashion dataset): TensorBoard and Python script

<div class="alert alert-warning">
    During the previous lecture (Deep Learning 1), you designed and trained a <b>Multilayer Perceptron (MLP)</b> network (i.e. a fully connected artificial neural network, ANN) to classify the MNIST fashion dataset.<br>
    <br>
    This exercise is meant to show you how you can use a Python script to design the network (often more convenient than Jupyter notebooks!), and visualize the training using the TensorBoard interface.
    <br>
    <br>
    <u>Note</u>: in the previous exercise we implemented a fairly deep network with several convolutional layers. The Python script "tf_mnist-fashion_cnn.py" implements a shallower CNN, in order to have a better comparison of the performance one can achieve with an MLP and a CNN (i.e. compare accuracy, number of parameters, etc.).
</div>

## run the python script

<div class="alert alert-success">
    Open a terminal, activate your conda environment, and launch the Python script "tf_mnist-fashion_mlp.py" as follows:
</div>

```
$ conda activate tf
$ cd <working dir>
$ python tf_mnist-fashion_mlp.py
```

## view training progression using TensorBoard

<div class="alert alert-success">
    Use TensorBoard in order to visualize the training progression.<br>
    <br>
    To do so, follow the steps described below:
</div>

Open a terminal and launch TensorBoard from your working environment:
```
$ conda activate tf
$ cd <working dir>
$ tensorboard --logdir log/mlp    # set directory used to store logs, as defined in the Python script
```
Open a web-browser at the address:
```
http://localhost:6006/
```