From: Shapley-based interpretation of deep learning models for wildfire spread rate prediction
No. | Hyperparameter | Value |
---|---|---|
1 | Number of layers (encoder) | 4 layers |
2 | Model dimension (d_model) | 512 |
3 | Number of heads | 8 heads |
4 | Feed-forward dimension | 2048 |
5 | Activation function | ReLU |
6 | Dropout rate | 0.1 |
7 | Attention mechanism | Scaled dot-product attention |
8 | Learning rate | 0.0001 |
9 | Learning rate scheduler | Cosine annealing |
10 | Weight initialization | Xavier initialization |
11 | Batch size | 32 |
12 | Number of epochs | 100 |
13 | Optimization algorithm | Adam |
14 | Warm-up steps | 4000 |
15 | Regularization (L2 penalty) | 0.0001 |
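
The settings in the table can be collected into a single configuration object, which makes the relationships between them easier to check. The following is a minimal sketch, not the authors' code: the names `TransformerConfig` and `lr_at_step` are hypothetical, and the way the 4000 warm-up steps combine with cosine annealing (linear warm-up to the base learning rate, then cosine decay to zero) is an assumption, since the table lists both settings without saying how they interact. The total number of training steps depends on the dataset size, which the table does not give, so it is left as a parameter.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformerConfig:
    """Hypothetical container; default values are copied from the table."""
    num_encoder_layers: int = 4
    d_model: int = 512
    num_heads: int = 8
    dim_feedforward: int = 2048
    activation: str = "relu"
    dropout: float = 0.1
    learning_rate: float = 1e-4
    batch_size: int = 32
    num_epochs: int = 100
    warmup_steps: int = 4000
    weight_decay: float = 1e-4  # L2 penalty

    @property
    def head_dim(self) -> int:
        # Multi-head attention splits d_model evenly across the heads.
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads


def lr_at_step(step: int, total_steps: int, cfg: TransformerConfig) -> float:
    """Learning rate at a given optimizer step.

    Assumed schedule (not stated in the table): linear warm-up over
    cfg.warmup_steps, then cosine annealing down to zero at total_steps.
    """
    if total_steps <= cfg.warmup_steps:
        raise ValueError("total_steps must exceed warmup_steps")
    if step < cfg.warmup_steps:
        # Linear ramp that reaches the base rate on the last warm-up step.
        return cfg.learning_rate * (step + 1) / cfg.warmup_steps
    # Fraction of the post-warm-up phase completed, clamped to [0, 1].
    progress = (step - cfg.warmup_steps) / (total_steps - cfg.warmup_steps)
    progress = min(progress, 1.0)
    return 0.5 * cfg.learning_rate * (1.0 + math.cos(math.pi * progress))
```

With these values each of the 8 heads operates on a 512 / 8 = 64-dimensional slice of the model dimension, and the feed-forward dimension is four times the model dimension. Calling `lr_at_step(step, total_steps, TransformerConfig())` for a range of steps gives the learning-rate curve under the assumed schedule.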