`rau.unidirectional`¶

class rau.unidirectional.Unidirectional¶

Bases: Module

An API for unidirectional sequential neural networks (including RNNs and transformer decoders).

Let \(B\) be the batch size and \(n\) be the length of the input sequence.

class State¶

Bases: object

Represents the hidden state of the module after processing a certain number of inputs.

batch_size()¶

Get the batch size of the tensors in this state.

Return type:: int

detach()¶

Return a copy of this state with all tensors detached.

Return type:: State

fastforward(input_sequence)¶

Feed a sequence of inputs to this state and return the resulting state.

Parameters:: input_sequence (Tensor) – A \(B \times n \times \cdots\) tensor, representing \(n\) input tensors.
Return type:: State
Returns:: Updated state after reading input_sequence.

forward(input_sequence, include_first, return_state=False, return_output=True)¶

Like Unidirectional.forward(), but start with this state as the initial state.

This can often be done more efficiently than using next() iteratively.

Parameters:

input_sequence (Tensor) – A \(B \times n \times \cdots\) tensor, representing \(n\) input tensors.
return_state (bool) – Whether to return the last State of the module.
include_first (bool) – Whether to prepend an extra tensor to the beginning of the output corresponding to an output from this state, before reading the first input.

Return type:

Tensor | ForwardResult

Returns:

See Unidirectional.forward().

next(input_tensor)¶

Feed an input to this hidden state and produce the next hidden state.

Parameters:: input_tensor (Tensor) – A tensor of size \(B \times \cdots\), representing an input for a single timestep.
Return type:: State
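The relationship between next() and fastforward() can be illustrated with a toy state. This is a minimal sketch that uses plain Python lists in place of tensors and does not import rau; the class name RunningSumState and its running-total behavior are hypothetical, chosen only to show that fast-forwarding over a \(B \times n\) sequence yields the same state as calling next() once per timestep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunningSumState:
    """Toy state (hypothetical, not part of rau): one running total per batch element."""

    totals: tuple

    def next(self, input_tensor):
        # input_tensor holds one value per batch element (size B).
        return RunningSumState(tuple(t + x for t, x in zip(self.totals, input_tensor)))

    def fastforward(self, input_sequence):
        # input_sequence is B x n; feed it to the state one timestep at a time.
        state = self
        n = len(input_sequence[0])
        for t in range(n):
            state = state.next([row[t] for row in input_sequence])
        return state

    def output(self):
        return list(self.totals)


initial = RunningSumState((0.0, 0.0))             # B = 2
inputs = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]]    # B x n with n = 3
final = initial.fastforward(inputs)
stepwise = initial.next([1.0, 10.0]).next([2.0, 20.0]).next([3.0, 30.0])
```

In this sketch each call returns a new state and leaves the receiver unchanged, so `initial` can be reused as the starting point for several continuations.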

output()¶

Get the output associated with this state.

For example, this can be the hidden state vector itself, or the hidden state passed through an affine transformation.

The return value is either a tensor or a tuple whose first element is a tensor. The other elements of the tuple can be used to return extra outputs.

Return type:: Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]
Returns:: A \(B \times \cdots\) tensor, or a tuple whose first element is a tensor. The other elements of the tuple can contain extra outputs. If there are any extra outputs, then the output of forward() and Unidirectional.forward() will contain the same number of extra outputs, where each extra output is a list containing all the outputs across all timesteps.
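The aggregation of extra outputs described above can be sketched as follows. This is an illustrative, dependency-free approximation rather than the library's implementation; the helper name collect_outputs and the example values are assumptions. It shows how per-timestep return values of the form (tensor, extra_1, extra_2, ...) turn into one list per extra output, each containing that output across all timesteps.

```python
def collect_outputs(outputs_per_timestep):
    """Split per-timestep output() values into main outputs and per-slot extra-output lists.

    Hypothetical helper: each element is either a main output or a tuple whose
    first element is the main output and whose remaining elements are extras.
    """
    main_outputs = []
    extra_outputs = None
    for value in outputs_per_timestep:
        if isinstance(value, tuple):
            main, *extras = value
        else:
            main, extras = value, []
        main_outputs.append(main)
        if extra_outputs is None:
            # One list per extra output slot, filled across timesteps.
            extra_outputs = [[] for _ in extras]
        for slot, extra in zip(extra_outputs, extras):
            slot.append(extra)
    return main_outputs, extra_outputs or []


# Three timesteps; each output() returns (main_output, extra_a, extra_b).
per_step = [
    ([0.1], "attn0", 0.5),
    ([0.2], "attn1", 0.6),
    ([0.3], "attn2", 0.7),
]
main, extras = collect_outputs(per_step)
```

Here `extras` has two entries, one per extra output, and each entry has one element per timestep.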

slice_batch(s)¶

Return a copy of this state with only certain batch elements included, determined by the slice s.

Parameters:: s (slice) – The slice object used to determine which batch elements to keep.
Return type:: State

transform_tensors(func)¶

Return a copy of this state with all tensors passed through a function.

Parameters:: func (Callable[[Tensor], Tensor]) – A function that will be applied to all tensors in this state.
Return type:: State
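As a rough illustration of slice_batch() and transform_tensors(), the sketch below uses nested lists as stand-ins for batch-major tensors. The class ToyState and its two fields are hypothetical, and expressing slice_batch() in terms of transform_tensors() is an assumption made for the sketch, not a statement about how rau implements it.

```python
class ToyState:
    """Hypothetical state holding two batch-major 'tensors' as nested lists."""

    def __init__(self, hidden, cell):
        self.hidden = hidden
        self.cell = cell

    def transform_tensors(self, func):
        # Return a new state; the original is left untouched.
        return ToyState(func(self.hidden), func(self.cell))

    def slice_batch(self, s):
        # Assumption: batch slicing can be phrased as a tensor transformation.
        return self.transform_tensors(lambda x: x[s])

    def batch_size(self):
        return len(self.hidden)


state = ToyState(hidden=[[1, 2], [3, 4], [5, 6]], cell=[[0], [1], [2]])  # B = 3
first_two = state.slice_batch(slice(0, 2))
doubled = state.transform_tensors(lambda x: [[2 * v for v in row] for row in x])
```

With real tensors, the function passed to transform_tensors() would typically be something like a device move or a detach applied uniformly to every tensor in the state.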

class StatefulComposedState¶

Bases: State

StatefulComposedState(parent: 'Unidirectional', first_is_main: bool, first_state: 'Unidirectional.State', second_state: 'Unidirectional.State')

__init__(parent, first_is_main, first_state, second_state)¶

batch_size()¶

Return type:: int

forward(input_sequence, include_first, return_state=False, return_output=True)¶

Return type:: Tensor | ForwardResult

next(input_tensor)¶

Return type:: State

output()¶

Return type:: Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]

transform_tensors(func)¶

Return type:: State

parent: Unidirectional¶

first_is_main: bool¶

first_state: State¶

second_state: State¶

__init__(main=False, tags=None)¶

as_composable()¶

Return type:: Composable

forward(input_sequence, *args, initial_state=None, return_state=False, include_first=True, **kwargs)¶

Run this module on an entire sequence of inputs all at once.

This can often be done more efficiently than processing each input one by one.

Parameters:

input_sequence (Tensor) – A \(B \times n \times \cdots\) tensor representing a sequence of \(n\) input tensors.
initial_state (State | None) – An optional initial state to use instead of the default initial state created by initial_state().
return_state (bool) – Whether to return the last State of the module as an additional output. This state can be used to initialize a subsequent run.
include_first (bool) – Whether to prepend an extra tensor to the beginning of the output corresponding to a prediction for the first element in the input. If include_first is true, then the length of the output tensor will be \(n + 1\). Otherwise, it will be \(n\).
args (Any) – Extra arguments passed to initial_state().
kwargs (Any) – Extra arguments passed to initial_state().

Return type:

Tensor | ForwardResult

Returns:

A Tensor or a ForwardResult that contains the output tensor. The output tensor will be of size \(B \times (n+1) \times \cdots\) if include_first is true and \(B \times n \times \cdots\) otherwise. If Unidirectional.State.output() returns extra outputs at each timestep, then they will be aggregated over all timesteps and returned as lists in ForwardResult.extra_outputs. If return_state is true, then the final State will be returned in ForwardResult.state. If there are no extra outputs and there is no state to return, just the output tensor is returned.
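The effect of include_first on the output length can be seen in a small, dependency-free sketch. The function toy_forward and its running-sum behavior are hypothetical stand-ins for a real module; nested lists play the role of \(B \times n\) tensors, and initial_output stands for the output of the initial state.

```python
def toy_forward(input_sequence, initial_output, include_first=True):
    """Running-sum 'model' over a B x n nested list (hypothetical, not rau code).

    Returns B x (n + 1) outputs if include_first is true, else B x n.
    """
    result = []
    for row, first in zip(input_sequence, initial_output):
        total = first
        # The extra leading element is the output before any input is read.
        outputs = [total] if include_first else []
        for x in row:
            total = total + x
            outputs.append(total)
        result.append(outputs)
    return result


inputs = [[1, 2, 3], [4, 5, 6]]  # B = 2, n = 3
with_first = toy_forward(inputs, initial_output=[0, 0], include_first=True)
without_first = toy_forward(inputs, initial_output=[0, 0], include_first=False)
```

In this sketch the two results differ only by the leading element, so dropping the first timestep of `with_first` recovers `without_first`.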

initial_composed_state(input_module, input_state, *args, **kwargs)¶

initial_state(batch_size, *args, **kwargs)¶

Get the initial state of the model.

Parameters:

batch_size (int) – Batch size.
args (Any) – Extra arguments passed from forward().
kwargs (Any) – Extra arguments passed from forward().

Return type:

State

Returns:

The initial state of the model.

main()¶

Mark this module as main.

Return type:: Unidirectional
Returns:: Self.

tag(tag)¶

Add a tag to this module for argument routing.

Parameters:: tag (str) – Tag name.
Return type:: Unidirectional
Returns:: Self.

class rau.unidirectional.ForwardResult¶

Bases: object

The output of a call to Unidirectional.forward() or Unidirectional.State.forward().

__init__(output, extra_outputs, state)¶

output: Tensor | None¶: The main output tensor of the module.

extra_outputs: list[list[Any]]¶: A list of extra outputs returned alongside the main output.

state: State | None¶: An optional state representing the updated state of the module after reading the inputs.
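Because forward() may return either a bare tensor or a ForwardResult, calling code sometimes needs to handle both forms. The following sketch mirrors the rule stated for Unidirectional.forward() using a stand-in dataclass; ToyForwardResult, package, and unpack_output are hypothetical names introduced for illustration and are not part of rau.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ToyForwardResult:
    """Stand-in with the same three fields as ForwardResult (not the rau class)."""

    output: Optional[Any]
    extra_outputs: list
    state: Optional[Any]


def package(output, extra_outputs, state, return_state):
    # Return the bare output when there are no extras and no state was requested.
    if not extra_outputs and not return_state:
        return output
    return ToyForwardResult(output, extra_outputs, state if return_state else None)


def unpack_output(result):
    # Normalize either return form to the main output.
    return result.output if isinstance(result, ToyForwardResult) else result


plain = package([1, 2], [], state="s", return_state=False)
wrapped = package([1, 2], [], state="s", return_state=True)
```

A helper like `unpack_output` keeps downstream code independent of whether extra outputs or a final state happened to be requested.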

class rau.unidirectional.StatelessUnidirectional¶

Bases: Unidirectional

A sequential module that has no temporal recurrence, but applies some function to every timestep.

class ComposedState¶

Bases: State

ComposedState(parent: 'StatelessUnidirectional', args: list[typing.Any], kwargs: dict[str, typing.Any], input_is_main: bool, use_initial_output: bool, input_state: rau.unidirectional.unidirectional.Unidirectional.State)

__init__(parent, args, kwargs, input_is_main, use_initial_output, input_state)¶

batch_size()¶

Return type:: int

forward(input_sequence, include_first, return_state=False, return_output=True)¶

Return type:: Tensor | ForwardResult

next(input_tensor)¶

Return type:: State

output()¶

Return type:: Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]

transform_tensors(func)¶

Return type:: State

parent: StatelessUnidirectional¶

args: list[Any]¶

kwargs: dict[str, Any]¶

input_is_main: bool¶

use_initial_output: bool¶

input_state: State¶

class State¶

Bases: State

State(parent: 'StatelessUnidirectional', args: list[typing.Any], kwargs: dict[str, typing.Any], _batch_size: int | None, input_tensor: torch.Tensor | None)

__init__(parent, args, kwargs, _batch_size, input_tensor)¶

batch_size()¶

Return type:: int

forward(input_sequence, include_first, return_state=False, return_output=True)¶

Return type:: Tensor | ForwardResult

next(input_tensor)¶

Return type:: State

output()¶

Return type:: Tensor | tuple[Tensor, ...]

transform_tensors(func)¶

Return type:: State

parent: StatelessUnidirectional¶

args: list[Any]¶

kwargs: dict[str, Any]¶

input_tensor: Tensor | None¶

forward_sequence(input_sequence, *args, **kwargs)¶

Transform a sequence of tensors.

Parameters:: input_sequence (Tensor) – A tensor of size \(B \times n \times \cdots\) representing a sequence of tensors.
Return type:: Tensor
Returns:: A tensor of size \(B \times n \times \cdots\).

forward_single(input_tensor, *args, **kwargs)¶

Transform an input tensor for a single timestep.

Parameters:: input_tensor (Tensor) – A tensor of size \(B \times \cdots\) representing a tensor for a single timestep.
Return type:: Tensor
Returns:: A tensor of size \(B \times \cdots\).
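Since a stateless module has no temporal recurrence, forward_sequence() and forward_single() are expected to agree timestep by timestep. The sketch below checks that property on a toy module built from plain lists; the class ToyStateless and its scaling function are hypothetical and do not use rau or PyTorch.

```python
class ToyStateless:
    """Hypothetical stateless module that scales every value by a constant."""

    def __init__(self, scale):
        self.scale = scale

    def forward_single(self, input_tensor):
        # input_tensor: one value per batch element (size B).
        return [self.scale * x for x in input_tensor]

    def forward_sequence(self, input_sequence):
        # input_sequence: B x n. No timestep depends on any other.
        return [[self.scale * x for x in row] for row in input_sequence]


module = ToyStateless(scale=3)
sequence = [[1, 2], [3, 4]]  # B = 2, n = 2
batched = module.forward_sequence(sequence)
stepwise = [module.forward_single([row[t] for row in sequence]) for t in range(2)]
# Transpose the per-timestep results (n x B) back to B x n for comparison.
stepwise_bn = [list(col) for col in zip(*stepwise)]
```

A check of this kind can serve as a unit test when overriding both methods in a subclass.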

initial_composed_state(input_module, input_state, *args, **kwargs)¶

Return type:: State

initial_output(batch_size, *args, **kwargs)¶

Get the output of the initial state. By default, this simply raises an error.

Parameters:: batch_size (int) – Batch size.
Return type:: Tensor
Returns:: A tensor of size \(B \times \cdots\).

initial_state(batch_size, *args, **kwargs)¶

Return type:: State

transform_args(args, func)¶

Return type:: list[Any]

transform_kwargs(kwargs, func)¶

Return type:: dict[str, Any]

class rau.unidirectional.StatelessLayerUnidirectional¶

Bases: StatelessUnidirectional

__init__(func)¶

forward_sequence(input_sequence, *args, **kwargs)¶

Return type:: Tensor

forward_single(input_tensor, *args, **kwargs)¶

Return type:: Tensor

class rau.unidirectional.StatelessReshapingLayerUnidirectional¶

Bases: StatelessLayerUnidirectional

forward_single(input_tensor, *args, **kwargs)¶

Return type:: Tensor

class rau.unidirectional.PositionalUnidirectional¶

Bases: Unidirectional

class State¶

Bases: State

State(parent: 'PositionalUnidirectional', position: int, _batch_size: int | None, input_tensor: torch.Tensor | None)

__init__(parent, position, _batch_size, input_tensor)¶

batch_size()¶

Return type:: int

forward(input_sequence, include_first, return_state=False, return_output=True)¶

Return type:: Tensor | ForwardResult

next(input_tensor)¶

Return type:: State

output()¶

Return type:: Tensor | tuple[Tensor, Unpack[tuple[Any, ...]]]

transform_tensors(func)¶

Return type:: State

parent: PositionalUnidirectional¶

position: int¶

input_tensor: Tensor | None¶

forward_at_position(input_tensor, position)¶

Compute the output for a single input at a certain position.

Parameters:

input_tensor (Tensor) – A tensor of size \(B \times \cdots\) representing an input tensor for a single timestep.
position (int) – An index indicating the current timestep. The first timestep has index 0.

Return type:

Tensor

Returns:

A tensor of size \(B \times \cdots\) representing the output tensor corresponding to the input tensor.
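To make the position argument concrete, here is a small sketch of a position-dependent function applied one timestep at a time, with positions counted from 0. The function body (adding the position to each input value) is a hypothetical example and stands in for whatever a subclass would compute; plain lists replace tensors.

```python
def forward_at_position(input_tensor, position):
    """Toy positional function: add the 0-based position to every batch element."""
    return [x + position for x in input_tensor]


sequence = [[10, 10, 10]]  # B = 1, n = 3
outputs = []
for position in range(3):
    # Select the input for this timestep across the batch.
    step_input = [row[position] for row in sequence]
    outputs.append(forward_at_position(step_input, position))
```

Each output depends only on the current input and its index, which is what distinguishes this kind of module from one with recurrent state.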

forward_from_position(input_sequence, position)¶

Compute the outputs for a sequence of inputs, starting at a certain position.

Parameters:

input_sequence (Tensor) – A tensor of size \(B \times n \times \cdots\) representing a sequence of input tensors.
position (int) – An index indicating the timestep corresponding to the first input of input_sequence. The first timestep has index 0.

Return type: