# Transformer

`rydberggpt.models.transformer`
## layers
### DecoderLayer

Bases: `Module`

A decoder layer is made of self-attention, source-attention, and a feed-forward network.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `size` | `int` | The input size (`d_model`). | *required* |
| `self_attn` | `MultiheadAttention` | The self-attention module. | *required* |
| `src_attn` | `MultiheadAttention` | The source-attention module. | *required* |
| `feed_forward` | `PositionwiseFeedForward` | The feed-forward module. | *required* |
| `dropout` | `float` | The dropout rate. | *required* |
Source code in src/rydberggpt/models/transformer/layers.py
#### `forward(x: torch.Tensor, memory: torch.Tensor, batch_mask: torch.Tensor) -> torch.Tensor`

Compute the forward pass through the decoder layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The input tensor. | *required* |
| `memory` | `Tensor` | The memory tensor. | *required* |
| `batch_mask` | `Tensor` | The mask tensor for batches. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The output tensor. |
Source code in src/rydberggpt/models/transformer/layers.py
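For orientation, here is a minimal sketch of the three-sublayer pattern a decoder layer applies: self-attention, source-attention over the encoder memory, then the feed-forward network, each inside a residual connection with layer norm. It uses `torch.nn.MultiheadAttention` as a stand-in, and the class name and pre-norm ordering are illustrative assumptions, not the repository's exact implementation:

```python
import torch
import torch.nn as nn

class DecoderLayerSketch(nn.Module):
    """Illustrative decoder layer: self-attn -> src-attn -> feed-forward."""

    def __init__(self, d_model: int, n_heads: int, d_ff: int, dropout: float):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.src_attn = nn.MultiheadAttention(d_model, n_heads, dropout=dropout, batch_first=True)
        self.feed_forward = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norms = nn.ModuleList(nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, x: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        # Each sublayer sits inside a residual connection (pre-norm shown here).
        h = self.norms[0](x)
        x = x + self.self_attn(h, h, h, need_weights=False)[0]
        h = self.norms[1](x)
        x = x + self.src_attn(h, memory, memory, need_weights=False)[0]
        return x + self.feed_forward(self.norms[2](x))

layer = DecoderLayerSketch(d_model=32, n_heads=4, d_ff=64, dropout=0.0)
out = layer(torch.randn(2, 5, 32), torch.randn(2, 7, 32))
print(out.shape)  # torch.Size([2, 5, 32])
```

Note that the sequence lengths of `x` and `memory` may differ; source-attention queries with the target and attends over the memory.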
### EncoderLayer

Bases: `Module`

An encoder layer is made up of self-attention and a feed-forward network.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `size` | `int` | The input size (`d_model`). | *required* |
| `self_attn` | `MultiheadAttention` | The self-attention module. | *required* |
| `feed_forward` | `PositionwiseFeedForward` | The feed-forward module. | *required* |
| `dropout` | `float` | The dropout rate. | *required* |
Source code in src/rydberggpt/models/transformer/layers.py
#### `forward(x: torch.Tensor, batch_mask: torch.Tensor) -> torch.Tensor`

Compute the forward pass through the encoder layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The input tensor. | *required* |
| `batch_mask` | `Tensor` | The mask tensor for batches. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The output tensor. |
Source code in src/rydberggpt/models/transformer/layers.py
## models
### Decoder

Bases: `Module`

The core of the transformer, which consists of a stack of decoder layers.
Source code in src/rydberggpt/models/transformer/models.py
#### `__init__(layer: nn.Module, n_layers: int)`

Initialize the Decoder class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `layer` | `Module` | A single instance of the decoder layer to be cloned. | *required* |
| `n_layers` | `int` | The number of decoder layers in the stack. | *required* |
Source code in src/rydberggpt/models/transformer/models.py
#### `forward(x: torch.Tensor, memory: torch.Tensor, batch_mask: torch.Tensor) -> torch.Tensor`

Pass the (masked) input through all layers of the decoder.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The input tensor to the decoder, of shape `(batch_size, seq_length, d_model)`. | *required* |
| `memory` | `Tensor` | The memory tensor, typically the output of the encoder. | *required* |
| `batch_mask` | `Tensor` | The mask tensor for batches. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The output tensor after passing through all layers of the decoder, of shape `(batch_size, seq_length, d_model)`. |
Source code in src/rydberggpt/models/transformer/models.py
### Encoder

Bases: `Module`

The core encoder, which consists of a stack of N layers.
Source code in src/rydberggpt/models/transformer/models.py
#### `__init__(layer: nn.Module, N: int)`

Initialize the Encoder class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `layer` | `Module` | A single instance of the encoder layer to be cloned. | *required* |
| `N` | `int` | The number of encoder layers in the stack. | *required* |
Source code in src/rydberggpt/models/transformer/models.py
#### `forward(x: torch.Tensor, batch_mask: torch.Tensor) -> torch.Tensor`

Pass the input through each layer in turn.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The input tensor to the encoder, of shape `(batch_size, seq_length, d_model)`. | *required* |
| `batch_mask` | `Tensor` | The mask tensor for batches. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The output tensor after passing through all layers of the encoder, with the same shape as the input tensor, `(batch_size, seq_length, d_model)`. |
Source code in src/rydberggpt/models/transformer/models.py
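The stack-of-N-layers pattern behind both `Encoder` and `Decoder` can be sketched as follows. The deep copy mirrors the `clones` helper documented under `utils`; the stand-in `nn.Linear` layer is purely illustrative:

```python
import copy
import torch
import torch.nn as nn

class EncoderSketch(nn.Module):
    """Illustrative stack of N identical layers applied in sequence."""

    def __init__(self, layer: nn.Module, N: int):
        super().__init__()
        # Deep-copy so each layer in the stack has independent parameters.
        self.layers = nn.ModuleList(copy.deepcopy(layer) for _ in range(N))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            x = layer(x)
        return x

enc = EncoderSketch(nn.Linear(8, 8), N=3)  # stand-in layer for illustration
out = enc(torch.randn(2, 5, 8))
print(out.shape)  # torch.Size([2, 5, 8])
```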
### EncoderDecoder

Bases: `LightningModule`

A standard encoder-decoder architecture; the base for this and many other models.
Source code in src/rydberggpt/models/transformer/models.py
#### `__init__(encoder: nn.Module, decoder: nn.Module, src_embed: nn.Module, tgt_embed: nn.Module, generator: nn.Module)`

Initialize the EncoderDecoder class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `encoder` | `Module` | The encoder module. | *required* |
| `decoder` | `Module` | The decoder module. | *required* |
| `src_embed` | `Module` | The source embedding module. | *required* |
| `tgt_embed` | `Module` | The target embedding module. | *required* |
| `generator` | `Module` | The generator module. | *required* |
Source code in src/rydberggpt/models/transformer/models.py
#### `decode(tgt: torch.Tensor, memory: torch.Tensor, batch_mask: torch.Tensor) -> torch.Tensor`

Decode the target tensor using the memory tensor.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `tgt` | `Tensor` | The target tensor of shape `(batch_size, tgt_seq_length, d_model_tgt)`. | *required* |
| `memory` | `Tensor` | The memory tensor of shape `(batch_size, src_seq_length, d_model)`. | *required* |
| `batch_mask` | `Tensor` | The mask tensor for batches. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The decoded tensor of shape `(batch_size, tgt_seq_length, d_model)`. |
Source code in src/rydberggpt/models/transformer/models.py
#### `encode(src: torch.Tensor) -> torch.Tensor`

Encode the source tensor.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `src` | `Tensor` | The source tensor of shape `(batch_size, src_seq_length, d_model_src)`. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The encoded tensor of shape `(batch_size, src_seq_length, d_model_tgt)`. |
Source code in src/rydberggpt/models/transformer/models.py
#### `forward(tgt: torch.Tensor, src: torch.Tensor) -> torch.Tensor`

Take in and process masked source and target sequences.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `tgt` | `Tensor` | The target tensor of shape `(batch_size, tgt_seq_length, d_model_tgt)`. | *required* |
| `src` | `Tensor` | The source tensor of shape `(batch_size, src_seq_length, d_model_src)`. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The output tensor after passing through the encoder-decoder architecture, with shape `(batch_size, tgt_seq_length, d_model)`. |
Source code in src/rydberggpt/models/transformer/models.py
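The `encode`, `decode`, and `forward` methods compose as sketched below: embed the source, encode it to a memory, embed the target, and decode it against that memory. This hypothetical sketch omits `batch_mask` for brevity and uses trivial stand-in modules; only the wiring reflects the documented behaviour:

```python
import torch
import torch.nn as nn

class EncoderDecoderSketch(nn.Module):
    """Illustrative wiring: embed source, encode to memory, decode target against it."""

    def __init__(self, encoder, decoder, src_embed, tgt_embed, generator):
        super().__init__()
        self.encoder, self.decoder = encoder, decoder
        self.src_embed, self.tgt_embed = src_embed, tgt_embed
        self.generator = generator  # applied separately to the decoder output

    def encode(self, src: torch.Tensor) -> torch.Tensor:
        return self.encoder(self.src_embed(src))

    def decode(self, tgt: torch.Tensor, memory: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.tgt_embed(tgt), memory)

    def forward(self, tgt: torch.Tensor, src: torch.Tensor) -> torch.Tensor:
        return self.decode(tgt, self.encode(src))

# Toy stand-ins: identity encoder/embeddings, and a decoder that mixes in the memory.
model = EncoderDecoderSketch(
    encoder=nn.Identity(),
    decoder=lambda x, m: x + m.mean(dim=1, keepdim=True),
    src_embed=nn.Identity(),
    tgt_embed=nn.Identity(),
    generator=nn.Identity(),
)
out = model(torch.randn(2, 5, 8), torch.randn(2, 7, 8))
print(out.shape)  # torch.Size([2, 5, 8])
```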
### Generator

Bases: `Module`

A linear layer followed by log-softmax for the generation step; `vocab_size` for Rydberg is 2.
Source code in src/rydberggpt/models/transformer/models.py
#### `__init__(d_model: int, vocab_size: int)`

Initialize the Generator class.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `d_model` | `int` | The dimension of the input features (i.e., the last dimension of the input tensor). | *required* |
| `vocab_size` | `int` | The size of the vocabulary, which determines the last dimension of the output tensor. | *required* |
Source code in src/rydberggpt/models/transformer/models.py
#### `forward(x: torch.Tensor) -> torch.Tensor`

Compute the forward pass of the Generator.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The input tensor of shape `(batch_size, seq_length, d_model)`. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The output tensor of shape `(batch_size, seq_length, vocab_size)`, with log-softmax applied along the last dimension. |
Source code in src/rydberggpt/models/transformer/models.py
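A sketch consistent with the documented behaviour (a linear projection, then log-softmax over the last dimension). The class body is an assumption, not the repository's code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GeneratorSketch(nn.Module):
    """Illustrative generation head: linear projection + log-softmax."""

    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        self.proj = nn.Linear(d_model, vocab_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.log_softmax(self.proj(x), dim=-1)

gen = GeneratorSketch(d_model=16, vocab_size=2)  # two states per Rydberg atom
logp = gen(torch.randn(3, 10, 16))
print(logp.shape)  # torch.Size([3, 10, 2])
```

Because the output is log-probabilities, `logp.exp()` sums to 1 along the last dimension at every position.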
## modules
### Embeddings

Bases: `Module`

The embedding layer.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `d_model` | `int` | The embedding size. | *required* |
| `vocab_size` | `int` | The vocabulary size. | *required* |
Source code in src/rydberggpt/models/transformer/modules.py
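A sketch of such an embedding layer. The `sqrt(d_model)` scaling follows the original Transformer recipe; whether this repository applies it is an assumption:

```python
import math
import torch
import torch.nn as nn

class EmbeddingsSketch(nn.Module):
    """Illustrative token embedding, scaled by sqrt(d_model)."""

    def __init__(self, d_model: int, vocab_size: int):
        super().__init__()
        self.lut = nn.Embedding(vocab_size, d_model)  # lookup table
        self.d_model = d_model

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.lut(x) * math.sqrt(self.d_model)

emb = EmbeddingsSketch(d_model=16, vocab_size=2)
tokens = torch.tensor([[0, 1, 1]])  # integer token ids
print(emb(tokens).shape)  # torch.Size([1, 3, 16])
```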
### PositionalEncoding

Bases: `Module`

Implements the positional encoding (PE) function.
Source code in src/rydberggpt/models/transformer/modules.py
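The standard sinusoidal PE function can be sketched as below; that this module follows the classic recipe is an assumption, and the helper name is illustrative:

```python
import math
import torch

def sinusoidal_pe(max_len: int, d_model: int) -> torch.Tensor:
    """Return a (max_len, d_model) table of sinusoidal positional encodings."""
    position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32) * (-math.log(10000.0) / d_model)
    )
    pe = torch.zeros(max_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even dims: sine
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dims: cosine
    return pe

pe = sinusoidal_pe(max_len=50, d_model=16)
print(pe.shape)  # torch.Size([50, 16])
```

At position 0 the sine terms are 0 and the cosine terms are 1, and every entry lies in [-1, 1].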
### PositionwiseFeedForward

Bases: `Module`

A two-layer feed-forward network.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `d_model` | `int` | The input size. | *required* |
| `d_ff` | `int` | The hidden size. | *required* |
| `dropout` | `float` | The dropout rate. | `0.1` |
Source code in src/rydberggpt/models/transformer/modules.py
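A sketch matching the parameter table above (linear expansion, nonlinearity, dropout, linear projection back). The ReLU activation and the placement of dropout are assumptions:

```python
import torch
import torch.nn as nn

class PositionwiseFeedForwardSketch(nn.Module):
    """Illustrative two-layer feed-forward network applied position-wise."""

    def __init__(self, d_model: int, d_ff: int, dropout: float = 0.1):
        super().__init__()
        self.w_1 = nn.Linear(d_model, d_ff)  # expand to the hidden size
        self.w_2 = nn.Linear(d_ff, d_model)  # project back to the model size
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.w_2(self.dropout(torch.relu(self.w_1(x))))

ff = PositionwiseFeedForwardSketch(d_model=16, d_ff=64)
out = ff(torch.randn(2, 5, 16))
print(out.shape)  # torch.Size([2, 5, 16])
```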
### SublayerConnection

Bases: `Module`

This module implements a residual connection followed by a layer norm.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `size` | `int` | The input size. | *required* |
| `dropout` | `float` | The dropout rate. | *required* |
Source code in src/rydberggpt/models/transformer/modules.py
#### `forward(x: torch.Tensor, sublayer: nn.Module) -> torch.Tensor`

Compute the forward pass through the module.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The input tensor. | *required* |
| `sublayer` | `Module` | The sublayer module. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The output tensor. |
Source code in src/rydberggpt/models/transformer/modules.py
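A sketch of the residual-plus-norm wrapper. The pre-norm variant is shown (`x + dropout(sublayer(norm(x)))`); the exact ordering of norm and residual in the repository is an assumption:

```python
import torch
import torch.nn as nn

class SublayerConnectionSketch(nn.Module):
    """Illustrative residual connection around an arbitrary sublayer."""

    def __init__(self, size: int, dropout: float):
        super().__init__()
        self.norm = nn.LayerNorm(size)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor, sublayer: nn.Module) -> torch.Tensor:
        # Pre-norm residual: normalize, apply the sublayer, drop out, add back.
        return x + self.dropout(sublayer(self.norm(x)))

sc = SublayerConnectionSketch(size=8, dropout=0.0)
out = sc(torch.randn(2, 3, 8), nn.Linear(8, 8))
print(out.shape)  # torch.Size([2, 3, 8])
```

Passing the sublayer as an argument lets one `SublayerConnection` instance wrap self-attention, source-attention, or the feed-forward network alike.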
## utils
### `clones(module: nn.Module, n_clones: int)`

Helper function that produces `n_clones` copies of a layer.
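A typical implementation of this helper, as in the Annotated Transformer; treat it as a sketch of the likely behaviour rather than the repository's exact code:

```python
import copy
import torch.nn as nn

def clones(module: nn.Module, n_clones: int) -> nn.ModuleList:
    """Produce n_clones independent deep copies of a module."""
    # Deep copies ensure the cloned layers do not share parameters.
    return nn.ModuleList(copy.deepcopy(module) for _ in range(n_clones))

layers = clones(nn.Linear(4, 4), 3)
print(len(layers))  # 3
```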
### `flattened_snake_flip(x: torch.Tensor, Lx: int, Ly: int) -> torch.Tensor`

Implements a "snake" flip, which reorders the flattened 2D tensor into snake order.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The tensor to apply the snake flip to; dimensions should be `[..., Ly * Lx]`. | *required* |
| `Lx` | `int` | The grid extent along x. | *required* |
| `Ly` | `int` | The grid extent along y. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The "snake" flipped tensor; dimensions will be `[..., Ly * Lx]`. |
Source code in src/rydberggpt/models/transformer/utils.py
### `snake_flip(x: torch.Tensor) -> torch.Tensor`

Implements a "snake" flip, which reorders a 2D tensor into snake order when flattened.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| `x` | `Tensor` | The tensor to apply the snake flip to; dimensions should be `[..., Ly, Lx]`. | *required* |
Returns:

| Type | Description |
|---|---|
| `Tensor` | The "snake" flipped tensor; dimensions will be `[..., Ly, Lx]`. |
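To illustrate, here is one plausible implementation: reverse every other row so that row-major flattening visits lattice sites in boustrophedon ("snake") order. Which rows get reversed (odd versus even) is an assumption and the repository's convention may differ; note the operation is its own inverse:

```python
import torch

def snake_flip_sketch(x: torch.Tensor) -> torch.Tensor:
    """Reverse every second row of the trailing [..., Ly, Lx] grid."""
    out = x.clone()
    out[..., 1::2, :] = torch.flip(out[..., 1::2, :], dims=[-1])
    return out

grid = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
print(snake_flip_sketch(grid))        # rows: [0, 1, 2] and [5, 4, 3]
```

Flattening the result gives `[0, 1, 2, 5, 4, 3]`: the traversal snakes left-to-right, then right-to-left.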