Transfer Learning 101
This workshop is meant to get you started with two tools: a functional retraining script, and TensorBoard to observe what that training is doing.
Follow these steps... or tinker as you will and ask away. Up to you!
- You have cloned (or downloaded and unzipped) the code in this repo. Notice the root folder will be TransferLearning (i.e. the parent of tf_files).
- You have downloaded a compressed MobileNet model into your $HOME/.keras/models directory, which is where tf.keras (TensorFlow's high-level API and framework) will look for it. We can work with these models:
  - a very small one, for image size 128, if you don't have a GPU and your CPU is not an i7;
  - the highest-accuracy one: this one.
- You have downloaded a new dataset: flowers
- You have extracted the tar file into tf_files/, so now you have the following "one-folder-per-class" structure:
  tf_files/
      split_flowers/
          train/
              daisy/
                  image97999.jpg
                  ...
              dandelion/
                  ...
          val/
              daisy/
                  image324.jpg
                  ...
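If you're curious what's happening under the hood, here is a minimal tf.keras sketch (not the workshop's own script) of those two steps: the first call fetches the MobileNet weights into $HOME/.keras/models (or reuses the cached copy), and flow_from_directory turns the one-folder-per-class tree into labelled batches:

import tensorflow as tf

# First use downloads the weights into $HOME/.keras/models; later runs reuse them.
base = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3),   # must match IMAGE_SIZE
    alpha=1.0,                   # the WIDTH multiplier
    include_top=False,           # we will train our own classifier head
    weights="imagenet")

# One class per sub-folder: daisy/, dandelion/, ... become the labels.
gen = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet.preprocess_input)
train = gen.flow_from_directory(
    "tf_files/split_flowers/train", target_size=(224, 224), batch_size=128)
print(train.class_indices)       # e.g. {'daisy': 0, 'dandelion': 1, ...}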
- Launch a first retraining with:
  sh/retrain.sh
- Start up TensorBoard from another terminal with:
  tensorboard --logdir=tf_files/training_summaries
- Open your browser and go to http://localhost:6006 and you'll see what your training looks like. We'll go over the interesting bits together. Keep it open for the rest of the workshop.
Sanity check: does your new model recognize images that were in its own training set?
python -m scripts.label_image \
  --labels=tf_files/models/retrained/retrained_labels.npy \
  --model_root_dir="tf_files/models/retrained" \
  --image=tf_files/flower_photos/daisy/21652746_cc379e0eea_m.jpg

You should see:
tulips 0.997504
roses 0.00142864
dandelion 0.000997444
daisy 6.52707e-05
sunflowers 4.72481e-06
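For the curious: scripts.label_image boils down to something like the sketch below. The exact file name of the saved model inside model_root_dir is an assumption here (check scripts/label_image.py for the real one):

import numpy as np
import tensorflow as tf

labels = np.load("tf_files/models/retrained/retrained_labels.npy")  # class names
model = tf.keras.models.load_model(
    "tf_files/models/retrained/retrained.h5")  # file name is an assumption

img = tf.keras.preprocessing.image.load_img(
    "tf_files/flower_photos/daisy/21652746_cc379e0eea_m.jpg",
    target_size=(224, 224))
x = tf.keras.preprocessing.image.img_to_array(img)[None, ...]  # add batch dim
x = tf.keras.applications.mobilenet.preprocess_input(x)        # scale to [-1, 1]

probs = model.predict(x)[0]
for i in probs.argsort()[::-1]:  # highest score first
    print(labels[i], probs[i])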
Better yet, let's check you didn't overfit on your training set: search online for new images and try your new model on them.
python -m scripts.label_image \
  --model_root_dir="tf_files/models/retrained" \
  --labels=tf_files/models/retrained/retrained_labels.npy \
  --image=path_to_new_image

Try to break it: what happens with a plush flower? Does it still count? There are a few edge cases in tf_files/unseen_data to get you started - do look for your own!
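To hammer a whole folder at once you could loop over it, along these lines (again, the saved-model file name is an assumption):

import glob
import numpy as np
import tensorflow as tf

labels = np.load("tf_files/models/retrained/retrained_labels.npy")
model = tf.keras.models.load_model(
    "tf_files/models/retrained/retrained.h5")  # file name is an assumption

for path in sorted(glob.glob("tf_files/unseen_data/*.jpg")):
    img = tf.keras.preprocessing.image.load_img(path, target_size=(224, 224))
    x = tf.keras.applications.mobilenet.preprocess_input(
        tf.keras.preprocessing.image.img_to_array(img)[None, ...])
    probs = model.predict(x)[0]
    print(path, "->", labels[probs.argmax()], probs.max())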
Let's mess with our model with this hideous thing:

python -m scripts.label_image \
--model_root_dir="tf_files/models/retrained" \
--labels=tf_files/models/retrained/retrained_labels.npy \
--image=tf_files/unseen_data/plastic_kitsch_daisy_thing.jpg
roses 0.8985332
tulips 0.04613225
daisy 0.037395664
sunflowers 0.010101912
dandelion 0.0078369435
Boo, that was not a rose. More like a cologne bottle with a... flower thing. But that wasn't in our classes and Transfer Learning isn't magic.
Change parameters (including the name of your output) in sh/retrain.sh.
Observe your progress on TensorBoard, which you can usually see in your browser at http://localhost:6006
You will see how it performs on unseen data after every training run (look for the accuracy of the re-trained model).
- Change one parameter at a time, so you have a baseline.
- Perhaps rename your model for each experiment, so you'll be able to find it in TensorBoard.
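In tf.keras terms, per-experiment naming boils down to pointing a TensorBoard callback at a sub-directory named after the run; how sh/retrain.sh actually composes its log directory may differ, so treat this as a sketch:

import tensorflow as tf

label = "bs_128-lr_0.01-opt_sgd"  # one descriptive name per experiment
tb = tf.keras.callbacks.TensorBoard(
    log_dir="tf_files/training_summaries/" + label)
# then pass it to training: model.fit(..., callbacks=[tb])
# each run shows up in TensorBoard as its own curve, side by side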
Open sh/retrain.sh. You should see several parameters you can pass to the retraining script:
# 224, 192, 160 or 128
IMAGE_SIZE=224
# 1.0, 0.75, 0.50 or 0.25
WIDTH=1.0
# adam or sgd
OPTIMIZER="sgd"
LEARNING_RATE=0.01
BATCH_SIZE=128
TEST_PERC=5
STEPS=500
LABEL="baseline"  # e.g. "bs_$BATCH_SIZE-lr_$LEARNING_RATE-opt_$OPTIMIZER"
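Roughly, this is how those knobs map onto tf.keras. A sketch of the idea, not the actual contents of the retraining script:

import tensorflow as tf

IMAGE_SIZE, WIDTH = 224, 1.0  # same options as the script: 224/192/160/128, 1.0/0.75/0.5/0.25
OPTIMIZER, LEARNING_RATE = "sgd", 0.01

base = tf.keras.applications.MobileNet(
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3), alpha=WIDTH,
    include_top=False, pooling="avg", weights="imagenet")
base.trainable = False  # transfer learning: only the new head trains
model = tf.keras.Sequential(
    [base, tf.keras.layers.Dense(5, activation="softmax")])  # 5 flower classes

opt = (tf.keras.optimizers.SGD(LEARNING_RATE) if OPTIMIZER == "sgd"
       else tf.keras.optimizers.Adam(LEARNING_RATE))
model.compile(optimizer=opt, loss="categorical_crossentropy",
              metrics=["accuracy"])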
- STEPS: A sensible value is 400. See what your training and validation loss look like after 10, 200 and 1000 steps.
- LEARNING_RATE: You will see that good learning rate values also depend on the optimizer you choose. For the default one (SGD), good values lie between 0.001 and 0.005: but try out much bigger (and much smaller) values to see how they affect training! Try using 10: you should see a wildly oscillating loss curve.
- BATCH_SIZE: A huge batch size in a huge network would make you run out of GPU memory. But see how very small values (e.g. 8) make the loss function bounce when you reduce it to the absurd.
- TEST_PERC: Make your network overfit like crazy by cranking TEST_PERC up to 90%. That means you're training on... the leftovers.
- WIDTH: Try an extra-wide network (by MobileNet standards): WIDTH=1.0 is the widest. Combine it with TEST_PERC to see how training and validation accuracies change when you train with more or less data.
- IMAGE_SIZE: The bigger, the better. We have a choice of 4 resolutions for MobileNets: 224, 192, 160, 128.
Have fun... and don't forget to check what the loss function ("Cross-entropy") looks like and how validation and training loss compare for each experiment!
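If you want to automate the one-parameter-at-a-time discipline, a sweep like this keeps every run separate in TensorBoard (a sketch assuming TF 2.x-style tf.keras; the workshop's own sweeps go through sh/retrain.sh):

import tensorflow as tf

gen = tf.keras.preprocessing.image.ImageDataGenerator(
    preprocessing_function=tf.keras.applications.mobilenet.preprocess_input)
train = gen.flow_from_directory("tf_files/split_flowers/train",
                                target_size=(128, 128), batch_size=32)

for lr in [0.001, 0.005, 0.05]:  # one knob, several values
    # rebuild from scratch every time so runs don't share weights
    base = tf.keras.applications.MobileNet(
        input_shape=(128, 128, 3), alpha=0.25,  # small variant to keep sweeps fast
        include_top=False, pooling="avg", weights="imagenet")
    base.trainable = False
    model = tf.keras.Sequential(
        [base, tf.keras.layers.Dense(train.num_classes, activation="softmax")])
    model.compile(optimizer=tf.keras.optimizers.SGD(lr),
                  loss="categorical_crossentropy", metrics=["accuracy"])
    model.fit(train, epochs=2, callbacks=[tf.keras.callbacks.TensorBoard(
        log_dir="tf_files/training_summaries/lr_%g" % lr)])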