rau.unidirectional

class rau.unidirectional.Unidirectional

Bases: Module

An API for unidirectional sequential neural networks (including RNNs and transformer decoders).

Let \(B\) be the batch size and \(n\) the length of the input sequence.

class State

Bases: object

Represents the hidden state of the module after processing a certain number of inputs.

batch_size()

Get the batch size of the tensors in this state.

Return type:

int

detach()

Return a copy of this state with all tensors detached.

Return type:

State

fastforward(input_sequence)

Feed a sequence of inputs to this state and return the resulting state.

Parameters:

input_sequence (Tensor) – A \(B \times n \times \cdots\) tensor, representing \(n\) input tensors.

Return type:

State

Returns:

Updated state after reading input_sequence.
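The relationship between fastforward() and next() can be illustrated with a toy stand-in. The sketch below does not use rau or PyTorch: ToyState is a hypothetical class whose "hidden state" is a running sum per batch element, and nested Python lists stand in for tensors. It assumes that reading a sequence with fastforward() is equivalent to calling next() once per timestep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyState:
    """Hypothetical stand-in for a State (not part of rau)."""

    total: tuple  # one running sum per batch element

    def next(self, input_tensor):
        # input_tensor holds one value per batch element (size B).
        return ToyState(tuple(t + x for t, x in zip(self.total, input_tensor)))

    def output(self):
        return self.total

    def fastforward(self, input_sequence):
        # input_sequence is a B x n nested list. Assumption: reading it is
        # equivalent to calling next() once per timestep.
        state = self
        n = len(input_sequence[0])
        for t in range(n):
            state = state.next([row[t] for row in input_sequence])
        return state


initial = ToyState((0.0, 0.0))                  # B = 2
batch = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]   # B x n with n = 3

# Step through the sequence one timestep at a time.
stepped = initial
for t in range(3):
    stepped = stepped.next([row[t] for row in batch])

# Read the whole sequence in one call.
fast = initial.fastforward(batch)
```

Both routes end in the same state here; a real implementation may reach it more efficiently than the explicit loop.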

forward(input_sequence, include_first, return_state=False, return_output=True)

Like Unidirectional.forward(), but start with this state as the initial state.

This can often be done more efficiently than using next() iteratively.

Parameters:
  • input_sequence (Tensor) – A \(B \times n \times \cdots\) tensor, representing \(n\) input tensors.

  • include_first (bool) – Whether to prepend to the output an extra tensor holding the output of this state before the first input is read.

  • return_state (bool) – Whether to return the last State of the module.

Return type:

Tensor | ForwardResult

Returns:

See Unidirectional.forward().

next(input_tensor)

Feed an input to this hidden state and produce the next hidden state.

Parameters:

input_tensor (Tensor) – A tensor of size \(B \times \cdots\), representing an input for a single timestep.

Return type:

State

output()

Get the output associated with this state.

For example, this can be the hidden state vector itself, or the hidden state passed through an affine transformation.

The return value is either a tensor or a tuple whose first element is a tensor. The other elements of the tuple can be used to return extra outputs.

Return type:

Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]

Returns:

A \(B \times \cdots\) tensor, or a tuple whose first element is a tensor. The other elements of the tuple can contain extra outputs. If there are any extra outputs, then the output of forward() and Unidirectional.forward() will contain the same number of extra outputs, where each extra output is a list containing all the outputs across all timesteps.
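The aggregation of extra outputs described above can be sketched in plain Python. This is an illustration only, not rau's implementation: collect_outputs is a hypothetical helper, and each element of step_outputs stands for what output() returned at one timestep.

```python
def collect_outputs(step_outputs):
    """Toy aggregation of per-timestep outputs (not rau code).

    Each element is either a bare main output or a tuple
    (main_output, extra_1, extra_2, ...).
    """
    main_outputs = []
    extra_outputs = None
    for out in step_outputs:
        if isinstance(out, tuple):
            main, *extras = out
        else:
            main, extras = out, []
        main_outputs.append(main)
        if extra_outputs is None:
            # One list per extra output, filled across timesteps.
            extra_outputs = [[] for _ in extras]
        for bucket, extra in zip(extra_outputs, extras):
            bucket.append(extra)
    return main_outputs, extra_outputs or []


# Three timesteps, each with a main output and two extra outputs.
per_step = [(0.1, "attn0", 5), (0.2, "attn1", 6), (0.3, "attn2", 7)]
main, extras = collect_outputs(per_step)
```

With two extra outputs per timestep, extras contains two lists, each with one entry per timestep.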

slice_batch(s)

Return a copy of this state with only certain batch elements included, determined by the slice s.

Parameters:

s (slice) – The slice object used to determine which batch elements to keep.

Return type:

State

transform_tensors(func)

Return a copy of this state with all tensors passed through a function.

Parameters:

func (Callable[[Tensor], Tensor]) – A function that will be applied to all tensors in this state.

Return type:

State
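A minimal sketch of the copy-and-transform pattern, using a hypothetical ToyState with Python lists in place of tensors. Expressing slice_batch() through transform_tensors() is an assumption made for illustration; the reference above does not say how rau implements it.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyState:
    """Hypothetical state holding two list-valued "tensors" (not rau code)."""

    hidden: list
    cell: list

    def transform_tensors(self, func):
        # Return a new state; the original is left untouched.
        return replace(self, hidden=func(self.hidden), cell=func(self.cell))

    def slice_batch(self, s):
        # Assumption: slicing the batch is indexing every tensor's first axis.
        return self.transform_tensors(lambda x: x[s])


state = ToyState(hidden=[1, 2, 3, 4], cell=[10, 20, 30, 40])  # B = 4
first_two = state.slice_batch(slice(0, 2))
doubled = state.transform_tensors(lambda x: [2 * v for v in x])
```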

class StatefulComposedState

Bases: State

StatefulComposedState(parent: 'Unidirectional', first_is_main: bool, first_state: 'Unidirectional.State', second_state: 'Unidirectional.State')

__init__(parent, first_is_main, first_state, second_state)
batch_size()
Return type:

int

forward(input_sequence, include_first, return_state=False, return_output=True)
Return type:

Tensor | ForwardResult

next(input_tensor)
Return type:

State

output()
Return type:

Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]

transform_tensors(func)
Return type:

State

parent: Unidirectional
first_is_main: bool
first_state: State
second_state: State
__init__(main=False, tags=None)
as_composable()
Return type:

Composable

forward(input_sequence, *args, initial_state=None, return_state=False, include_first=True, **kwargs)

Run this module on an entire sequence of inputs all at once.

This can often be done more efficiently than processing each input one by one.

Parameters:
  • input_sequence (Tensor) – A \(B \times n \times \cdots\) tensor representing a sequence of \(n\) input tensors.

  • initial_state (State | None) – An optional initial state to use instead of the default initial state created by initial_state().

  • return_state (bool) – Whether to return the last State of the module as an additional output. This state can be used to initialize a subsequent run.

  • include_first (bool) – Whether to prepend an extra tensor to the beginning of the output corresponding to a prediction for the first element in the input. If include_first is true, then the length of the output tensor will be \(n + 1\). Otherwise, it will be \(n\).

  • args (Any) – Extra arguments passed to initial_state().

  • kwargs (Any) – Extra arguments passed to initial_state().

Return type:

Tensor | ForwardResult

Returns:

A Tensor or a ForwardResult that contains the output tensor. The output tensor will be of size \(B \times (n+1) \times \cdots\) if include_first is true and \(B \times n \times \cdots\) otherwise. If Unidirectional.State.output() returns extra outputs at each timestep, then they will be aggregated over all timesteps and returned as lists in ForwardResult.extra_outputs. If return_state is true, then the final State will be returned in ForwardResult.state. If there are no extra outputs and there is no state to return, just the output tensor is returned.
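The effect of include_first and return_state on the return value can be sketched with a toy running-sum model. Everything here is hypothetical: toy_forward and ToyForwardResult are stand-ins written for this example, the batch dimension is omitted, and a plain integer plays the role of the state.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ToyForwardResult:
    """Hypothetical stand-in for ForwardResult (not the rau class)."""

    output: list
    extra_outputs: list = field(default_factory=list)
    state: Optional[Any] = None


def toy_forward(input_sequence, initial_state=0, include_first=True,
                return_state=False):
    """Running sum over one sequence of length n."""
    total = initial_state
    # With include_first, the output of the initial state comes first,
    # so the result has n + 1 entries instead of n.
    outputs = [total] if include_first else []
    for x in input_sequence:
        total += x
        outputs.append(total)
    if return_state:
        return ToyForwardResult(output=outputs, state=total)
    # No extra outputs and no state requested: return the bare output.
    return outputs


with_first = toy_forward([1, 2, 3])
without_first = toy_forward([1, 2, 3], include_first=False)
result = toy_forward([1, 2, 3], return_state=True)
# The returned state can seed a subsequent run.
continued = toy_forward([4], initial_state=result.state, include_first=False)
```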

initial_composed_state(input_module, input_state, *args, **kwargs)
initial_state(batch_size, *args, **kwargs)

Get the initial state of the model.

Parameters:
  • batch_size (int) – Batch size.

  • args (Any) – Extra arguments passed from forward().

  • kwargs (Any) – Extra arguments passed from forward().

Return type:

State

Returns:

The initial state.

main()

Mark this module as main.

Return type:

Unidirectional

Returns:

Self.

tag(tag)

Add a tag to this module for argument routing.

Parameters:

tag (str) – Tag name.

Return type:

Unidirectional

Returns:

Self.
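Because main() and tag() both return the module itself, calls can be chained. The sketch below mirrors only that chaining behavior with a hypothetical ToyModule; the attribute names is_main and tags are assumptions for this example and are not taken from rau.

```python
class ToyModule:
    """Hypothetical module that mirrors only the chaining behavior."""

    def __init__(self, main=False, tags=None):
        self.is_main = main
        self.tags = set(tags) if tags is not None else set()

    def main(self):
        # Mark this module as main and return it so calls can be chained.
        self.is_main = True
        return self

    def tag(self, tag):
        # Record a tag name and return the module itself.
        self.tags.add(tag)
        return self


module = ToyModule()
chained = module.main().tag("decoder")
```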

class rau.unidirectional.ForwardResult

Bases: object

The output of a call to Unidirectional.forward() or Unidirectional.State.forward().

__init__(output, extra_outputs, state)
output: Tensor | None

The main output tensor of the module.

extra_outputs: list[list[Any]]

A list of extra outputs returned alongside the main output.

state: State | None

An optional state representing the updated state of the module after reading the inputs.

class rau.unidirectional.StatelessUnidirectional

Bases: Unidirectional

A sequential module that has no temporal recurrence, but applies some function to every timestep.

class ComposedState

Bases: State

ComposedState(parent: 'StatelessUnidirectional', args: list[typing.Any], kwargs: dict[str, typing.Any], input_is_main: bool, use_initial_output: bool, input_state: rau.unidirectional.unidirectional.Unidirectional.State)

__init__(parent, args, kwargs, input_is_main, use_initial_output, input_state)
batch_size()
Return type:

int

forward(input_sequence, include_first, return_state=False, return_output=True)
Return type:

Tensor | ForwardResult

next(input_tensor)
Return type:

State

output()
Return type:

Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]

transform_tensors(func)
Return type:

State

parent: StatelessUnidirectional
args: list[Any]
kwargs: dict[str, Any]
input_is_main: bool
use_initial_output: bool
input_state: State
class State

Bases: State

State(parent: 'StatelessUnidirectional', args: list[typing.Any], kwargs: dict[str, typing.Any], _batch_size: int | None, input_tensor: torch.Tensor | None)

__init__(parent, args, kwargs, _batch_size, input_tensor)
batch_size()
Return type:

int

forward(input_sequence, include_first, return_state=False, return_output=True)
Return type:

Tensor | ForwardResult

next(input_tensor)
Return type:

State

output()
Return type:

Tensor | tuple[Tensor, ...]

transform_tensors(func)
Return type:

State

parent: StatelessUnidirectional
args: list[Any]
kwargs: dict[str, Any]
input_tensor: Tensor | None
forward_sequence(input_sequence, *args, **kwargs)

Transform a sequence of tensors.

Parameters:

input_sequence (Tensor) – A tensor of size \(B \times n \times \cdots\) representing a sequence of tensors.

Return type:

Tensor

Returns:

A tensor of size \(B \times n \times \cdots\).

forward_single(input_tensor, *args, **kwargs)

Transform an input tensor for a single timestep.

Parameters:

input_tensor (Tensor) – A tensor of size \(B \times \cdots\) representing a tensor for a single timestep.

Return type:

Tensor

Returns:

A tensor of size \(B \times \cdots\).
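Since a stateless module applies the same function at every timestep, forward_sequence() and forward_single() are expected to agree position by position. The following sketch shows that relationship with a hypothetical ToyStateless class and nested lists in place of tensors; it is not a subclass of the rau class.

```python
class ToyStateless:
    """Hypothetical stateless layer computing 2 * x + 1 (not rau code)."""

    def forward_single(self, input_tensor):
        # input_tensor: list of B values for one timestep.
        return [2 * x + 1 for x in input_tensor]

    def forward_sequence(self, input_sequence):
        # input_sequence: B x n nested list; the same function is applied
        # to every timestep independently.
        return [[2 * x + 1 for x in row] for row in input_sequence]


layer = ToyStateless()
batch = [[1, 2, 3], [4, 5, 6]]  # B = 2, n = 3

seq_out = layer.forward_sequence(batch)

# Apply forward_single() timestep by timestep, giving an n x B result,
# then transpose it back to B x n for comparison.
step_out = [layer.forward_single([row[t] for row in batch]) for t in range(3)]
step_out_bn = [list(col) for col in zip(*step_out)]
```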

initial_composed_state(input_module, input_state, *args, **kwargs)
Return type:

State

initial_output(batch_size, *args, **kwargs)

Get the output of the initial state. By default, this simply raises an error.

Parameters:

batch_size (int) – Batch size.

Return type:

Tensor

Returns:

A tensor of size \(B \times \cdots\).

initial_state(batch_size, *args, **kwargs)
Return type:

State

transform_args(args, func)
Return type:

list[Any]

transform_kwargs(kwargs, func)
Return type:

dict[str, Any]

class rau.unidirectional.StatelessLayerUnidirectional

Bases: StatelessUnidirectional

__init__(func)
forward_sequence(input_sequence, *args, **kwargs)
Return type:

Tensor

forward_single(input_tensor, *args, **kwargs)
Return type:

Tensor

class rau.unidirectional.StatelessReshapingLayerUnidirectional

Bases: StatelessLayerUnidirectional

forward_single(input_tensor, *args, **kwargs)
Return type:

Tensor

class rau.unidirectional.PositionalUnidirectional

Bases: Unidirectional

class State

Bases: State

State(parent: 'PositionalUnidirectional', position: int, _batch_size: int | None, input_tensor: torch.Tensor | None)

__init__(parent, position, _batch_size, input_tensor)
batch_size()
Return type:

int

forward(input_sequence, include_first, return_state=False, return_output=True)
Return type:

Tensor | ForwardResult

next(input_tensor)
Return type:

State

output()
Return type:

Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]

transform_tensors(func)
Return type:

State

parent: PositionalUnidirectional
position: int
input_tensor: Tensor | None
forward_at_position(input_tensor, position)

Compute the output for a single input at a certain position.

Parameters:
  • input_tensor (Tensor) – A tensor of size \(B \times \cdots\) representing an input tensor for a single timestep.

  • position (int) – An index indicating the current timestep. The first timestep has index 0.

Return type:

Tensor

Returns:

A tensor of size \(B \times \cdots\) representing the output tensor corresponding to the input tensor.

forward_from_position(input_sequence, position)

Compute the outputs for a sequence of inputs, starting at a certain position.

Parameters:
  • input_sequence (Tensor) – A tensor of size \(B \times n \times \cdots\) representing a sequence of input tensors.

  • position (int) – An index indicating the timestep corresponding to the first input of input_sequence. The first timestep has index 0.

Return type:

Tensor

Returns:

A tensor of size \(B \times n' \times \cdots\) representing a sequence of output tensors.
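The two methods can be related as follows: processing a sequence from position \(p\) should match processing its \(t\)-th element at position \(p + t\). The sketch below illustrates this with a hypothetical ToyPositional class that simply adds the position index to each input. It uses nested lists instead of tensors and is not derived from rau's code.

```python
class ToyPositional:
    """Hypothetical position-dependent layer (not rau code)."""

    def forward_at_position(self, input_tensor, position):
        # input_tensor: list of B values; position: 0-based timestep index.
        return [x + position for x in input_tensor]

    def forward_from_position(self, input_sequence, position):
        # input_sequence: B x n nested list whose first element sits at
        # timestep `position`; element t sits at `position + t`.
        return [
            [x + position + t for t, x in enumerate(row)]
            for row in input_sequence
        ]


layer = ToyPositional()
batch = [[10, 10, 10], [20, 20, 20]]  # B = 2, n = 3

from_two = layer.forward_from_position(batch, position=2)

# The same result, built one timestep at a time and transposed to B x n.
stepwise = [
    layer.forward_at_position([row[t] for row in batch], position=2 + t)
    for t in range(3)
]
stepwise_bn = [list(col) for col in zip(*stepwise)]
```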

initial_state(batch_size, *args, **kwargs)
Return type:

State

class rau.unidirectional.ComposedUnidirectional

Bases: Unidirectional

Stacks one unidirectional model on another, so that the outputs of the first are fed as inputs to the second.

__init__(first, second)
forward(input_sequence, *args, initial_state=None, return_state=False, include_first=True, tag_kwargs=None, **kwargs)
Return type:

Tensor | ForwardResult

initial_state(batch_size, *args, tag_kwargs=None, **kwargs)
Return type:

State
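The idea of feeding the first model's outputs to the second can be sketched at the level of states. All three classes below are hypothetical stand-ins written for this example, with scalars in place of tensors: a running-sum state, a stateless doubling state, and a composed state that chains them. The sketch leaves out argument routing, tags, and the output of the initial state.

```python
class RunningSumState:
    """Hypothetical recurrent state: the sum of all inputs read so far."""

    def __init__(self, total=0):
        self.total = total

    def next(self, x):
        return RunningSumState(self.total + x)

    def output(self):
        return self.total


class DoubleState:
    """Hypothetical stateless state: doubles the most recent input."""

    def __init__(self, value=None):
        self.value = value

    def next(self, x):
        return DoubleState(2 * x)

    def output(self):
        return self.value


class ToyComposedState:
    """Feeds the first state's output into the second state."""

    def __init__(self, first_state, second_state):
        self.first_state = first_state
        self.second_state = second_state

    def next(self, x):
        first = self.first_state.next(x)
        second = self.second_state.next(first.output())
        return ToyComposedState(first, second)

    def output(self):
        return self.second_state.output()


state = ToyComposedState(RunningSumState(), DoubleState())
outputs = []
for x in [1, 2, 3]:
    state = state.next(x)
    outputs.append(state.output())
```

The running sums 1, 3, 6 produced by the first model are doubled by the second, so each composed output depends only on the inputs read so far.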

class rau.unidirectional.DropoutUnidirectional

Bases: StatelessLayerUnidirectional

__init__(dropout)
class rau.unidirectional.EmbeddingUnidirectional

Bases: StatelessLayerUnidirectional

__init__(vocabulary_size, output_size, use_padding, shared_embeddings=None)
class rau.unidirectional.OutputUnidirectional

Bases: StatelessLayerUnidirectional

__init__(input_size, vocabulary_size, shared_embeddings=None, bias=True)
class rau.unidirectional.ResidualUnidirectional

Bases: Unidirectional

class State

Bases: State

State(input_tensor: torch.Tensor | None, wrapped_state: rau.unidirectional.unidirectional.Unidirectional.State)

__init__(input_tensor, wrapped_state)
batch_size()
Return type:

int

forward(input_sequence, include_first, return_state=False, return_output=True)
Return type:

Tensor | ForwardResult

next(input_tensor)
Return type:

State

output()
Return type:

Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]

transform_tensors(func)
Return type:

State

input_tensor: Tensor | None
wrapped_state: State
__init__(module)
initial_state(batch_size, *args, **kwargs)
Return type:

State

class rau.unidirectional.StatelessResidualUnidirectional

Bases: StatelessUnidirectional

__init__(module)
forward_sequence(input_sequence, *args, **kwargs)
Return type:

Tensor

forward_single(input_tensor, *args, **kwargs)
Return type:

Tensor