from labml.configs import BaseConfigs


class RWKVConfigs(BaseConfigs):
    """
    This defines configurations for a transformer. The configurations are
    calculated using option functions. These are lazy loaded and therefore
    only the necessary modules are calculated.
    """
    # Number of attention heads
    n_heads: int = 8
    # Transformer embedding size
    d_model: int = 512
    # Number of layers
    n_layers: int = 6
    # Dropout probability
    dropout: float = 0.1
    # Number of tokens in the source vocabulary (for token embeddings)
    n_src_vocab: int
    # Number of tokens in the target vocabulary (to generate logits for prediction)
    n_tgt_vocab: int
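The lazy calculation mentioned in the class docstring comes from labml's option functions. As a minimal sketch (the config class, the derived field `d_head`, and the calculator name are illustrative, not part of `RWKVConfigs`), a function registered with `@option` is only evaluated when that configuration value is actually needed:

from labml.configs import BaseConfigs, option


class SketchConfigs(BaseConfigs):
    # values set directly
    d_model: int = 512
    n_heads: int = 8
    # value computed lazily by the option function below
    d_head: int


@option(SketchConfigs.d_head)
def default_d_head(c: SketchConfigs):
    # runs only when `d_head` is required,
    # e.g. when the experiment calculates the configurations
    return c.d_model // c.n_heads

Note that `n_src_vocab` and `n_tgt_vocab` above have no default values, so they must be set explicitly or computed by option functions defined elsewhere before the configurations are used.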