# Deeplodocus

## Introduction

Initially created by A. Leroy and S. Westlake, PhD candidates at Cranfield University, Deeplodocus is a high-level framework designed to help create, train, analyze and test deep networks. Initially based on Keras, the framework has been rewritten from scratch in a modular, flexible and config-file-based fashion. The currently supported backend is PyTorch, but you are more than welcome to contribute to integrate another library.
Deeplodocus embeds functionalities such as:
- Smart and automatic data loading
- Data augmentation / transformation
- Display of metrics and losses during the training, validation and test stages
- Saving of results (model, weights and metrics)
- Custom losses and metrics
- Explicit logging system
- Training configuration from config files
- And more!
All the Deeplodocus parameters are available in the config files.
Deeplodocus is designed to be cross-platform (Windows, Linux and Mac OS). It requires Python 3.5+.
The current version of Deeplodocus relies on PyTorch as its backend. Please install the adequate version of PyTorch available HERE.
Once PyTorch is installed, you can install Deeplodocus with pip:

```
pip install deeplodocus
```
Deeplodocus relies on the following packages:

- PyTorch (v1.0.0+)
- numpy (v1.15.1+)
- pyyaml (3.13+)
- pandas (0.23.1+)
- matplotlib (2.2.2+)
- aiohttp (3.4.0+)
- psutil (5.4.8+)
- cv2 (3.1.4+) (for fast image transformations)
Feature status:

- User-friendly notifications [✓]
- User interface on browser [Under construction]
- Logs [✓]
- Automatic trainer [✓]
- Automatic tester/validator [✓]
- Callbacks [✓]
- History [✓]
- Saving the model [✓]
- Smart saving [✓]
- Dataset Loader [✓]
- Data checker [Under construction]
- Data Augmentation
    - Transformations [✓]
    - Filters
    - Augmentation modes [Under construction]
There are two ways to check the Deeplodocus version:

- Look at the version displayed in the Deeplodocus logo when starting the Brain of your project
- Run the following command in the terminal:

```
deeplodocus version
```
To create a new Deeplodocus project, open a terminal / command line and enter:

```
deeplodocus startproject "project-name"
```

replacing "project-name" with the name of your project.
This command will generate the following structure in the current folder:
- Main.py (Main file)
- Config folder
- Logs folder
- Results folder
- Model folder
- Losses folder
- Metrics folder
- Possible other folders to be defined
The current structure of Deeplodocus is described by the following schema:
INSERT FIGURE
Deeplodocus is composed of 3 entries:
- Deeplodocus Admin : A terminal / command line entry to create a Deeplodocus project
- Deeplodocus Brain : A main script file entry to train / test a deep network
- Deeplodocus UI : A browser access to an interface to configure and use Deeplodocus
The command-line entry gives access to functions in the Deeplodocus admin system, such as creating a Deeplodocus project.
To access the admin commands, just enter deeplodocus in the terminal followed by the required command.
Deeplodocus Brain is available through the Brain class. A brain instance gives access to all the functionalities of Deeplodocus such as training or testing the network. It is also possible to access the brain using the main file generated by Deeplodocus Admin.
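For reference, the main file is typically only a few lines long. The sketch below is an assumption of what it looks like; the import path, constructor argument and wake() call should be checked against the file actually generated by Deeplodocus Admin:

```python
# main.py - minimal sketch (assumed API; refer to the file generated by `deeplodocus startproject`)
from deeplodocus.brain import Brain   # assumed import path

if __name__ == "__main__":
    # The Brain is assumed to take the path to the config folder of the project
    brain = Brain(config_dir="./config")
    # Waking the brain starts the interactive Deeplodocus session (training, testing, ...)
    brain.wake()
```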
Using the terminal and modifying the config files may lead to errors that cost time in the development of a network. The Deeplodocus team is therefore developing a web interface that allows you to:
- Define and generate config files dynamically
- Visualise the metrics in real time
- Visualise the generated network (graph)
- Preview data augmentation/transforms on the dataset
- ...
Deeplodocus comes with a notification system composed of the following categories:
- Info (blue): Displays generic information to the user (default)
- Success (green): Displays a successful action result
- Warning (yellow): Displays a warning
- Error (red): Displays a non-blocking error to the user
- Fatal error (white with red background): Displays a blocking error to the user and stops the brain
- Result (white): Displays a training result
- Input (blinking white): Asks the user for a keyboard input
All notifications are saved into a log file in logs/notification-datetime.logs
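As an illustration only, the behaviour described above (coloured terminal output plus a log file) could be sketched as follows; the function name and colour codes are assumptions, not the Deeplodocus API:

```python
# Illustration only: a simplified notifier mimicking the behaviour described above.
import datetime
import os

COLOURS = {
    "info": "\033[94m",      # blue
    "success": "\033[92m",   # green
    "warning": "\033[93m",   # yellow
    "error": "\033[91m",     # red
    "fatal": "\033[97;41m",  # white on red background
    "result": "\033[97m",    # white
}


def notify(category: str, message: str, log_dir: str = "logs") -> None:
    """Print a coloured notification and append it to a log file."""
    colour = COLOURS.get(category, "")
    print("%s[%s] %s\033[0m" % (colour, category.upper(), message))
    # Every notification is also written to logs/notification-<datetime>.logs
    os.makedirs(log_dir, exist_ok=True)
    stamp = datetime.datetime.now().strftime("%Y-%m-%d")
    with open(os.path.join(log_dir, "notification-%s.logs" % stamp), "a") as log:
        log.write("%s [%s] %s\n" % (datetime.datetime.now(), category.upper(), message))


notify("success", "Model weights saved")
```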
The data in Deeplodocus is split into 3 different entries:
- Inputs (input given to your Machine Learning algorithm)
- Labels (Expected Output, optional)
- Additional Data (Additional data given to the loss function if required, optional)
In order to load the data you have to feed Deeplodocus using the config/data.yaml file:
data.yaml

```yaml
# _____________________________
#
# ---- DATA CONFIG EXAMPLE ----
# _____________________________
#
dataloader:
  batch_size: 4        # Number of instances loaded in one batch
  num_workers: 4       # Number of processes or threads used for loading the data into memory

dataset:
  train:               # Training entries
    inputs:
      - ["input1-1.txt", "input1-2.txt"]
      - "input2.txt"
    labels:
      - "labels1.txt"
      - "labels2.txt"
    additional_data:
      - "additional.txt"

  validation:          # Validation entries
    inputs:
      - ["input1-1.txt", "input1-2.txt"]
      - "input2.txt"
    labels:
      - "labels1.txt"
      - "labels2.txt"
    additional_data:
      - "additional.txt"

  test:                # Test entries
    inputs:
      - ["input1-1.txt", "input1-2.txt"]
      - "input2.txt"
    labels:
      - "labels1.txt"
      - "labels2.txt"
    additional_data:
      - "additional.txt"
```

Deeplodocus can load data referenced in text files (image paths, video paths, numbers, text, numpy array paths, etc.) as well as files inside folders. Therefore you can give either a file path or a folder path.
NOTE: Relative paths are relative to the main file
```yaml
# Example
train:
  inputs:
    - "input1.txt"          # Works

train:
  inputs:
    - "./path_to_folder/"   # Works as well
```

If you have multiple entries, please add one item per entry, as shown below:
```yaml
# Example
train:
  inputs:
    - "input1.txt"   # Input 1
    - "input2.txt"   # Input 2
```

If one entry is split across different locations, you can merge these sources into one using brackets:
```yaml
# Example
train:
  inputs:
    - ["input1-1.txt", "input1-2.txt"]   # Input 1 = input1-1 + input1-2
    - "input2.txt"                       # Input 2
```

Data is loaded by Deeplodocus through two interfaces:
- Dataset
- Dataloader
The Dataset has two main objectives:
- Automatically read, check the completeness of, and format the data given in the config files (files and folders)
- Open and augment/transform the data before it is transmitted to the network

The Dataloader can call the Dataset in parallel with the training, using the CPU.
The data is assembled into batches and then sent to the Trainer or the Tester.
NOTE: The Dataloader is currently provided by PyTorch's DataLoader.
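For reference, the pattern is equivalent to wrapping a PyTorch Dataset in torch.utils.data.DataLoader. The sketch below is plain PyTorch (not the Deeplodocus API) and mirrors the batch_size and num_workers options from config/data.yaml:

```python
# Plain-PyTorch sketch of what the Dataloader does conceptually
import torch
from torch.utils.data import Dataset, DataLoader


class DummyDataset(Dataset):
    """Stand-in dataset returning (input, label) pairs."""

    def __len__(self):
        return 100

    def __getitem__(self, index):
        x = torch.rand(3, 32, 32)               # an input instance
        y = torch.randint(0, 10, (1,)).item()   # its label
        return x, y


if __name__ == "__main__":
    # batch_size and num_workers mirror the dataloader section of config/data.yaml
    loader = DataLoader(DummyDataset(), batch_size=4, num_workers=4, shuffle=True)
    for inputs, labels in loader:
        # Each iteration yields one batch, ready to be sent to the Trainer or Tester
        print(inputs.shape, labels.shape)
        break
```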
Data transformation is performed using four different Transformer classes managed by a TransformManager. The following Transformer classes are available:
- Sequential
- OneOf
- SomeOf
- Pointer

Here is an example of a config file that generates the TransformManager for training, validation and testing.
Make sure all three are given in one file.
```yaml
# transform_config.yaml
# ___________________________________
#
# ---- TRANSFORM MANAGER EXAMPLE ----
# ___________________________________
#
train:                  # Training entries
  name: "Train Transform Manager"
  inputs:
    - "./config/transforms/transform_input_train_left.yaml"
    - "*input:0"        # Example of a pointer which points to the first transformer of the inputs
  labels:
    - Null
    - Null
  additional_data:
    - Null

validation:             # Validation entries
  name: "Validation Transform Manager"
  inputs:
    - "./config/transforms/transform_input_validation_left.yaml"
    - "*input:0"        # Example of a pointer which points to the first transformer of the inputs
  labels:
    - Null
    - Null
  additional_data:
    - Null

test:                   # Test entries
  name: "Test Transform Manager"
  inputs:
    - Null
    - Null
  labels:
    - Null
    - Null
  additional_data:
    - Null
```

Each entry has to be given a Transformer. If none is desired, please enter Null.
The Sequential transformer applies transformation operations sequentially to the given input.
Here is an example of a config file for a Sequential transformer:
```yaml
# sequential_example.yaml
# ______________________________________
#
# ---- TRANSFORM SEQUENTIAL EXAMPLE ----
# ______________________________________
#
method: "sequential"
name: "Sequential example"
transforms:
  - blur:
      kernel_size: 3
  - random_blur:
      kernel_size_min: 15
      kernel_size_max: 21
```

For more details on each transform, please check the corresponding documentation.
The OneOf transformer applies to the given input one transformation operation chosen among those available in the given list.
Here is an example of a config file for a OneOf transformer:
```yaml
# oneof_example.yaml
# _________________________________
#
# ---- TRANSFORM ONEOF EXAMPLE ----
# _________________________________
#
method: "oneof"
name: "OneOf example"
transforms:
  - blur:
      kernel_size: 3
  - random_blur:
      kernel_size_min: 15
      kernel_size_max: 21
```

For more details on each transform, please check the corresponding documentation.
The SomeOf transformer applies multiple transformation operations to the given input.
The number of operations applied is fixed if the user specifies num_transformations.
Alternatively, the number of transformations can be a random number between num_transformations_min and num_transformations_max.
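The selection of how many operations to apply can be sketched as follows; this is an illustration of the described behaviour only, not the Deeplodocus implementation:

```python
import random


def number_of_transforms(num_transformations=None,
                         num_transformations_min=1,
                         num_transformations_max=5):
    """Return how many operations a SomeOf transformer would apply.

    Mirrors the behaviour described above: a fixed number if
    num_transformations is given, otherwise a random number between
    the min and max bounds (illustration only).
    """
    if num_transformations is not None:
        return num_transformations
    return random.randint(num_transformations_min, num_transformations_max)
```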
Here is an example of a config file for a SomeOf transformer:
```yaml
# someof_example.yaml
# __________________________________
#
# ---- TRANSFORM SOMEOF EXAMPLE ----
# __________________________________
#
method: "someof"
name: "SomeOf example"
num_transformations: 3        # If used, comment out "num_transformations_min" and "num_transformations_max"
#num_transformations_min: 1   # If used, comment out "num_transformations"
#num_transformations_max: 5   # If used, comment out "num_transformations"
transforms:
  - blur:
      kernel_size: 3
  - random_blur:
      kernel_size_min: 15
      kernel_size_max: 21
```

For more details on each transform, please check the corresponding documentation.
The Pointer redirects the transformation process to another transformer. Using a pointer has two major advantages:
- It avoids creating another transformer config file
- It allows a multi-entry system to have exactly the same transformation operations applied to all its entries. [1]

[1] e.g. working on a stereo vision system: the input of the network consists of two images (left and right). If we create two transformers (one for the left image and one for the right image), the left and right images will not have the same transformations applied when random operations are used. Using a pointer from the right image to the left transformer ensures exactly the same transformations are applied to both images, even when random operations are involved (e.g. random blur).
To define a Pointer, instead of the path to the transformer config file, enter:

```
"*category:index"
# "*" is necessary so that Deeplodocus understands it is a pointer
# category: "input", "label" or "additional_data"
# index: the index of the transformer in the selected category (indexed at 0)
# e.g. "*input:0" points to the first transformer of the inputs
```

Note: a Pointer cannot point to another Pointer.
Offline data augmentation is not available
Not only does Deeplodocus provide a rich list of standard built-in transformation operations, it also allows you to define your own transforms!
The transforms can either be fixed operations or random operations. The following example illustrates two different transforms (blurs): the first blur method directly uses the given kernel size, whereas the second one computes a blur whose kernel size is picked between its min and max parameters.
```python
import random

import numpy as np
import cv2


def blur(image: np.array, kernel_size: int):
    kernel = (int(kernel_size), int(kernel_size))
    image = cv2.blur(image, kernel)
    # No need to return the info on the current transform operation
    # because there is no random parameter here.
    # Make sure to replace the info of the current transform by None.
    return image, None


def random_blur(image: np.array, kernel_size_min: int, kernel_size_max: int):
    # Pick a random odd kernel size between the min and max bounds
    kernel_size = (random.randint(kernel_size_min // 2, kernel_size_max // 2)) * 2 + 1
    image, _ = blur(image, kernel_size)
    # Return the info on the last operation used (required for the Pointer!)
    # ["Name", method, dictionary of arguments required]
    transform = ["blur", blur, {"kernel_size": kernel_size}]
    # Return the result of the transform and the info of the current transform operation
    return image, transform
```

The custom transforms can be saved inside the config/modules/transforms folder in any Python file. This architecture allows you to structure your files as you wish.
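For instance, the custom transform defined above can be tried in isolation on a dummy image; this is a quick standalone check, independent of Deeplodocus:

```python
import numpy as np

# Create a dummy 64x64 RGB image and apply the custom random_blur transform
image = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
blurred, transform_info = random_blur(image, kernel_size_min=15, kernel_size_max=21)

# transform_info records the exact operation applied, e.g.
# ["blur", <function blur>, {"kernel_size": 17}]
print(blurred.shape, transform_info[0], transform_info[2])
```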