Heterogeneous GNN Driven Information Prediction Framework
Our HIF is implemented mainly based on the following libraries (see the README file in the source code folder for more details):
- PyTorch: https://pytorch.org/
- PyTorch Lightning: https://www.pytorchlightning.ai/
- DGL: https://www.dgl.ai/
- PyTorch-NLP: https://github.com/PetrochukM/PyTorch-NLP
- Networkx: https://networkx.org/
Our main project file structure and description:
HIF
├─ README.md
├─ environment.yaml # our environment list
├─ data
│ ├─ processed # processed data generated by the program
│ ├─ raw # our raw data
├─ models # package of models (HIF and its variants)
│ ├─ __init__.py
│ ├─ classfier.py
│ ├─ hetero.py
│ ├─ predictor.py
│ ├─ test.py
├─ nn # package of neural network layers
│ ├─ coattention
│ ├─ conv
│ ├─ functional
│ ├─ metrics
│ ├─ nlp
│ ├─ tcn
│ ├─ time
│ ├─ __init__.py
│ ├─ conv.py # graph convolution layer
│ ├─ embedding.py # embedding module
│ ├─ noise.py
│ ├─ positional_encoding.py
│ ├─ reinforcements.py
│ └─ readout.py # output layer
├─ run.py # entry point for running the program
└─ utils # utilities for drawing, data loading, tensor and parameter processing
├─ data
├─ draw
├─ graph_op
├─ __init__.py
├─ additive_dict.py
├─ ckpt.py
├─ clean_lightning_logs.py
├─ group_tensorborad_data.py
├─ indexer.py
Installation requirements are described in environment.yaml.
- Create Conda Environment:
Open a terminal or command prompt and run the following command:
conda env create -f environment.yaml
- Activate Conda Environment:
After the installation is complete, activate the newly created Conda environment using the following command:
conda activate <environment_name>
- Verify Environment Installation:
You can verify whether the environment is correctly installed:
conda list
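You can also check that the core libraries import correctly from inside the environment. The one-liner below is a minimal sketch that assumes the packages listed above (PyTorch-NLP is imported as torchnlp):
python -c "import torch, dgl, pytorch_lightning, torchnlp, networkx; print(torch.__version__, dgl.__version__, pytorch_lightning.__version__)"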
Before running, please make sure that HIF is located in the /root/hif directory.
Then, get help for all parameters of data processing, training, and optimization:
python run.py --help
Run:
python run.py --parameter value
Here we list some basic parameter settings for your reference:
Basic settings
| Parameter | Value | Description |
|---|---|---|
| gpus | 1 | Number of GPUs to train on (int) or which GPUs to train on (list or str) applied per node. |
| time_loss_weig | 5e-5 | Weight of time node similarity. |
| num_workers | 4 | Number of workers. |
| num_heads | 16 | Number of attention heads. |
| log_every_n_steps | 5 | How often to log within steps. |
| batch_size | 64 | Batch Size. |
| name | SSC | Dataset name. |
| observation | 1.0 | Observation time. |
| sample | 0.05 | Sample. |
| accumulate_grad_batches | 1 | Accumulates grads every k batches or as set up in the dict. |
| in_feats | 256 | Dimension of inputs. |
| learning_rate | 5e-3 | Learning rate to optimize parameters. |
| weight_decay | 5e-3 | Weight decay. |
| l1_weight | 0 | L1 loss. |
| patience | 35 | Patience of early stopping. |
| num_time_nodes | 6 | Number of time nodes. |
| soft_partition | 2 | Time node soft partition size. |
| num_gcn_layers | 2 | The number of GCN layers. |
| readout | ml | The readout module. |
| num_readout_layers | 3 | The number of internal layers of readout module. |
| time_decay_pos | all | Position of time decay. |
| time_module | transformer | Time embedding module. |
| num_time_module_layers | 4 | The number of layers of time module, such as rnn layers or transformer encoder layers. |
| dropout | 0.3 | Dropout. |
| dropout_edge | 0 | Dropout edges. |
| noise_weight | 0 | Weight of noise. |
| noise_rate | 1 | Coverage rate of noise. |
| noise_dim | 1 | Dimension of noise. |
| hop | 1 | Hops of sampled followers. |
| alpha | 10 | Weight to adjust sampled followers. |
| beta | 100 | Minimum number of sampled followers. |
| source_base | 4 | Weight of source nodes. |
| reposted_base | 5 | Weight of reposted nodes. |
| leaf_base | 6 | Weight of leaf nodes. |
| method | bfs | Method to sample followers. |
| gradient_clip_val | 1 | The value at which to clip gradients. |
| max_epochs | 300 | Stop training once this number of epochs is reached. |
| enable_progress_bar | False | Whether to enable the progress bar by default. |
| enable_model_summar | False | Whether to enable model summarization by default. |
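As a usage sketch, a single-GPU training run on the SSC dataset with the values above could be launched as follows (illustrative only; the exact flag names should be confirmed with python run.py --help):
python run.py --name SSC --observation 1.0 --gpus 1 --batch_size 64 --in_feats 256 --learning_rate 5e-3 --max_epochs 300 --patience 35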