Commit 1c15523
Merge branch 'poutyne'
2 parents 0a74b78 + de9b0a5 commit 1c15523

File tree

9 files changed (+99 −24 lines)


README.md

Lines changed: 14 additions & 11 deletions
````diff
@@ -57,7 +57,7 @@ Every deep learning project has at least three main steps:
 ## Project
 One good idea is to store all the paths to interesting locations, e.g. the dataset folder, in a shared class that can be accessed by anyone in the project. You should never hardcode paths: define them once and import them, so if you later change your structure you will only have to modify one file.
 If we have a look at `Project.py`, we can see how we defined the `data_dir` and the `checkpoint_dir` once and for all. We are using the 'new' [Path](https://docs.python.org/3/library/pathlib.html) APIs, which support different OSes out of the box and also make it easier to join and concatenate paths.
-![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/develop/images/Project.png)
+![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/master/images/Project.png)
 For example, if we want to know the data location we can:
 ```python3
 from Project import Project
@@ -70,13 +70,16 @@ In our example, we directly used `ImageDataset` from `torchvision` but we includ
 ### Transformation
 You usually have to do some preprocessing on the data, e.g. resize the images and apply data augmentation. All your transformations should go inside `.data.transformation`. In our template, we included a wrapper for
 [imgaug](https://imgaug.readthedocs.io/en/latest/)
-![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/develop/images/transformation.png)
+![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/master/images/transformation.png)
 ### Dataloaders
 As you know, you have to create a `Dataloader` to feed your data into the model. In the `data.__init__.py` file we expose a very simple function, `get_dataloaders`, to automatically configure the *train, val and test* data loaders using a few parameters.
-![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/develop/images/data.png)
+![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/master/images/data.png)
 ## Losses
 Sometimes you may need to define custom losses; you can include them in the `./losses` package. For example:
-![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/develop/images/losses.png)
+![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/master/images/losses.png)
+## Metrics
+Sometimes you may need to define custom metrics. For example:
+![alt](https://raw.githubusercontent.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/master/images/metrics.png)
 ## Logging
 We included the python [logging](https://docs.python.org/3/library/logging.html) module. You can import and use it by:
 
@@ -88,27 +91,27 @@ logger.info('print() is for noobs')
 ## Models
 All your models go inside `models`. In our case, we have a very basic CNN, and we override the `resnet18` function to provide a frozen model to finetune.
 
-![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/develop/images/resnet.png?raw=true)
+![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/master/images/resnet.png?raw=true)
 ## Train/Evaluation
 In our case we kept things simple: all the training and evaluation logic is inside `.main.py`, where we used [poutyne](https://pypi.org/project/Poutyne/) as the main library. We already defined a useful list of callbacks:
 - learning rate scheduler
 - auto-save of the best model
 - early stopping
 Usually, this is all you need!
-![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/develop/images/main.png?raw=true)
+![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/master/images/main.png?raw=true)
 ### Callbacks
 You may need to create custom callbacks; with [poutyne](https://pypi.org/project/Poutyne/) this is very easy since it supports a Keras-like API. Your custom callbacks should go inside `./callbacks`. For example, we have created one to update Comet every epoch.
-![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/develop/images/CometCallback.png?raw=true)
+![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/master/images/CometCallback.png?raw=true)
 
 ### Track your experiment
 We are using [comet](https://www.comet.ml/) to automatically track our models' results. This is what comet's board looks like after a few model runs.
-![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/develop/images/comet.jpg?raw=true)
+![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/master/images/comet.jpg?raw=true)
 Running `main.py` produces the following output:
-![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/develop/images/output.jpg?raw=true)
+![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/master/images/output.jpg?raw=true)
 ## Utils
 We also created different utility functions to plot both datasets and dataloaders. They are in `utils.py`. For example, calling `show_dl` on our train and val datasets produces the following outputs.
-![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/develop/images/Figure_1.png?raw=true)
-![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/develop/images/Figure_2.png?raw=true)
+![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/master/images/Figure_1.png?raw=true)
+![alt](https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Skeletron/blob/master/images/Figure_2.png?raw=true)
 As you can see, data augmentation is correctly applied on the train set.
 ## Conclusions
 I hope you found some useful information and that this template will help you on your next amazing project :)
````
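For reference, a minimal sketch of what the `Project` class described above might look like, using `pathlib`; `data_dir` and `checkpoint_dir` are named in the README, while the folder layout here is an assumption:

```python
# a sketch of Project.py, not the repository's actual file
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Project:
    """Single place where every interesting path is defined, so nothing is hardcoded twice."""
    # assumption: the repository root is the folder containing this file
    base_dir: Path = Path(__file__).parent

    @property
    def data_dir(self) -> Path:
        return self.base_dir / 'dataset'  # assumed dataset folder name

    @property
    def checkpoint_dir(self) -> Path:
        path = self.base_dir / 'checkpoint'
        path.mkdir(exist_ok=True)  # create the folder before the first checkpoint is saved
        return path
```

Any module can then do `from Project import Project; Project().data_dir`, which is the usage shown in the README snippet above.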

data/__init__.py

Lines changed: 5 additions & 4 deletions
```diff
@@ -10,7 +10,8 @@ def get_dataloaders(
         train_transform=None,
         val_transform=None,
         split=(0.5, 0.5),
-        batch_size=32):
+        batch_size=32,
+        *args, **kwargs):
     """
     This function returns the train, val and test dataloaders.
     """
@@ -27,8 +28,8 @@ def get_dataloaders(
     val_ds, test_ds = random_split(val_ds, lengths.tolist())
     logging.info(f'Train samples={len(train_ds)}, Validation samples={len(val_ds)}, Test samples={len(test_ds)}')
 
-    train_dl = DataLoader(train_ds, batch_size=batch_size, shuffle=True, pin_memory=True, num_workers=4)
-    val_dl = DataLoader(val_ds, batch_size=batch_size, shuffle=False, pin_memory=True, num_workers=4)
-    test_dl = DataLoader(test_ds, batch_size=batch_size, shuffle=False, pin_memory=True, num_workers=4)
+    train_dl = DataLoader(train_ds, batch_size=batch_size, shuffle=True, *args, **kwargs)
+    val_dl = DataLoader(val_ds, batch_size=batch_size, shuffle=False, *args, **kwargs)
+    test_dl = DataLoader(test_ds, batch_size=batch_size, shuffle=False, *args, **kwargs)
 
     return train_dl, val_dl, test_dl
```
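Because `*args, **kwargs` are now forwarded to `DataLoader`, callers can pick loader options such as `pin_memory` or `num_workers` per call instead of having them hardcoded. A usage sketch (the directory paths are illustrative):

```python
# hypothetical call site; extra keyword arguments flow straight into torch.utils.data.DataLoader
from data import get_dataloaders

train_dl, val_dl, test_dl = get_dataloaders(
    'dataset/train',   # illustrative paths
    'dataset/val',
    batch_size=32,
    pin_memory=True,   # forwarded via **kwargs
    num_workers=4,     # forwarded via **kwargs
)
```

This matches how `main.py` calls the function later in this commit.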

images/data.png  −57.8 KB

images/main.png  30.7 KB

images/metrics.png  178 KB

losses/__init__.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,6 +1,6 @@
 import torch
-# define custom losses
 
+# define custom losses
 def my_loss(output, target):
     loss = torch.mean((output - target) ** 2)
     return loss
```
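A quick sanity check of `my_loss` on random tensors (shapes are illustrative):

```python
import torch

from losses import my_loss

output = torch.randn(8, 2)  # fake model outputs
target = torch.randn(8, 2)  # fake targets
loss = my_loss(output, target)
print(loss)  # a scalar tensor: the mean squared error
```

Since poutyne's `Model` accepts a callable loss, `my_loss` could be passed directly in place of the `"cross_entropy"` string used in `main.py`.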

main.py

Lines changed: 54 additions & 8 deletions
```diff
@@ -12,14 +12,60 @@
 from callbacks import CometCallback
 from logger import logging
 
-project = Project()
-# our hyperparameters
-params = {
-    'lr': 0.001,
-    'batch_size': 32,
-    'model': 'resnet18-finetune'
-}
-logging.info(f'Using device={device} 🚀')
+if __name__ == '__main__':
+    project = Project()
+    # our hyperparameters
+    params = {
+        'lr': 0.001,
+        'batch_size': 32,
+        'epochs': 10,
+        'model': 'resnet18-finetune'
+    }
+    logging.info(f'Using device={device} 🚀')
+    # everything starts with the data
+    train_dl, val_dl, test_dl = get_dataloaders(
+        project.data_dir / "train",
+        project.data_dir / "val",
+        val_transform=val_transform,
+        train_transform=train_transform,
+        batch_size=params['batch_size'],
+        pin_memory=True,
+        num_workers=4,
+    )
+    # it is always good practice to visualise some of the train and val images to be sure data-aug
+    # is applied properly
+    show_dl(train_dl)
+    show_dl(test_dl)
+    # define our comet experiment
+    experiment = Experiment(api_key="YOUR_KEY",
+                            project_name="dl-pytorch-template", workspace="francescosaveriozuppichini")
+    experiment.log_parameters(params)
+    # create our special resnet18
+    cnn = resnet18(2).to(device)
+    # print the model summary to show useful information
+    logging.info(summary(cnn, (3, 224, 224)))
+    # define custom optimizer and instantiate the trainer `Model`
+    optimizer = optim.Adam(cnn.parameters(), lr=params['lr'])
+    model = Model(cnn, optimizer, "cross_entropy",
+                  batch_metrics=["accuracy"]).to(device)
+    # usually you want to reduce the lr on plateau and store the best model
+    callbacks = [
+        ReduceLROnPlateau(monitor="val_acc", patience=5, verbose=True),
+        ModelCheckpoint(str(project.checkpoint_dir /
+                            f"{time.time()}-model.pt"), save_best_only=True, verbose=True),
+        EarlyStopping(monitor="val_acc", patience=10, mode='max'),
+        CometCallback(experiment)
+    ]
+    model.fit_generator(
+        train_dl,
+        val_dl,
+        epochs=params['epochs'],
+        callbacks=callbacks,
+    )
+    # get the results on the test set
+    loss, test_acc = model.evaluate_generator(test_dl)
+    logging.info(f'test_acc=({test_acc})')
+    experiment.log_metric('test_acc', test_acc)
 
 if __name__ == '__main__':
     # everything starts with the data
```
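The `CometCallback` imported at the top of this file is not part of the diff; a minimal sketch of what it might look like, assuming poutyne's Keras-like `Callback` base class with an `on_epoch_end(epoch, logs)` hook:

```python
# a sketch, not the repository's actual callbacks/__init__.py
from poutyne.framework import Callback


class CometCallback(Callback):
    """Pushes poutyne's per-epoch logs (loss, acc, val_loss, ...) to a Comet experiment."""

    def __init__(self, experiment):
        super().__init__()
        self.experiment = experiment

    def on_epoch_end(self, epoch, logs):
        # logs is a dict with the metrics poutyne tracked during this epoch
        self.experiment.log_metrics(logs, epoch=epoch)
```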

metrics/__init__.py

Lines changed: 22 additions & 0 deletions
```diff
@@ -0,0 +1,22 @@
+from poutyne.framework.metrics import EpochMetric
+
+# define a custom metric as a function
+def my_metric(y_true, y_pred):
+    pass
+
+# or as a class when we need to accumulate
+class MyEpochMetric(EpochMetric):
+    def forward(self, y_pred, y_true):
+        """
+        Defines the behavior of the metric when called.
+        Args:
+            y_pred: The prediction of the model.
+            y_true: Target to evaluate the model.
+        """
+        pass
+
+    def get_metric(self):
+        """
+        Compute and return the metric.
+        """
+        pass
```
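As a concrete example of the accumulating variant, here is a sketch of an epoch-level accuracy metric; it assumes the `EpochMetric` interface shown above plus a `reset()` hook that clears state between epochs:

```python
import torch
from poutyne.framework.metrics import EpochMetric


class EpochAccuracy(EpochMetric):
    """Accumulates correct predictions over the whole epoch and returns their ratio."""

    def __init__(self):
        super().__init__()
        self.correct, self.total = 0, 0

    def forward(self, y_pred, y_true):
        # called on every batch: accumulate instead of returning a value
        self.correct += (y_pred.argmax(dim=1) == y_true).sum().item()
        self.total += y_true.size(0)

    def get_metric(self):
        return self.correct / max(self.total, 1)

    def reset(self):
        self.correct, self.total = 0, 0
```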

secrets.json

Lines changed: 3 additions & 0 deletions
```diff
@@ -0,0 +1,3 @@
+{
+    "COMET_API_KEY" : "8THqoAxomFyzBgzkStlY95MOf"
+}
```
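The key can then be read at run time instead of being hardcoded in `main.py`; a sketch (the loading helper is illustrative, not part of this commit):

```python
# illustrative helper, not part of this commit
import json

from comet_ml import Experiment


def load_comet_key(path='secrets.json'):
    with open(path) as f:
        return json.load(f)['COMET_API_KEY']


experiment = Experiment(api_key=load_comet_key(),
                        project_name="dl-pytorch-template")
```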
