Table 7 List of best hyperparameters for transformer model

From: Shapley-based interpretation of deep learning models for wildfire spread rate prediction

| Sr # | Hyperparameter | Selected value |
|---|---|---|
| 1 | Number of layers (encoder) | 4 layers |
| 2 | Model dimension (d_model) | 512 |
| 3 | Number of heads | 8 heads |
| 4 | Feed-forward dimension | 2048 |
| 5 | Activation function | ReLU |
| 6 | Dropout rate | 0.1 |
| 7 | Attention mechanism | Scaled dot-product attention |
| 8 | Learning rate | 0.0001 |
| 9 | Learning rate scheduler | Cosine annealing |
| 10 | Weight initialization | Xavier initialization |
| 11 | Batch size | 32 |
| 12 | Number of epochs | 100 |
| 13 | Optimization algorithm | Adam |
| 14 | Warm-up steps | 4000 |
| 15 | Regularization (L2 penalty) | 0.0001 |
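
As a rough illustration of how these settings fit together, the sketch below wires the Table 7 values into a standard PyTorch encoder stack. It is not the authors' released code: the input dimensionality, total number of training steps, and the exact way the 4000-step warm-up is combined with cosine annealing are assumptions, since the table does not specify them.

```python
# A minimal sketch (assumed implementation, not the authors' code) assembling
# the Table 7 hyperparameters in PyTorch. TOTAL_STEPS is a placeholder that
# depends on dataset size and batch count.
import math
import torch
import torch.nn as nn

D_MODEL, N_HEADS, N_LAYERS, FFN_DIM = 512, 8, 4, 2048
DROPOUT, LR, WEIGHT_DECAY = 0.1, 1e-4, 1e-4
BATCH_SIZE, EPOCHS, WARMUP_STEPS = 32, 100, 4000

# Encoder stack: 4 layers of multi-head scaled dot-product attention (8 heads),
# 2048-dim feed-forward sublayers, ReLU activation, dropout 0.1.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=D_MODEL, nhead=N_HEADS, dim_feedforward=FFN_DIM,
    dropout=DROPOUT, activation="relu", batch_first=True,
)
model = nn.TransformerEncoder(encoder_layer, num_layers=N_LAYERS)

# Xavier (Glorot) initialization for all weight matrices.
for p in model.parameters():
    if p.dim() > 1:
        nn.init.xavier_uniform_(p)

# Adam optimizer; the L2 penalty is applied via weight_decay.
optimizer = torch.optim.Adam(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)

# Linear warm-up over the first 4000 steps, then cosine annealing
# (one assumed way to combine rows 9 and 14 of the table).
TOTAL_STEPS = 100_000  # placeholder value

def lr_lambda(step: int) -> float:
    if step < WARMUP_STEPS:
        return (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
```

In a training loop, `optimizer.step()` would be followed by `scheduler.step()` once per batch, with `BATCH_SIZE` = 32 and `EPOCHS` = 100 as listed in the table.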