tfmdp.policy.layers package

Submodules

tfmdp.policy.layers.action_layer module

class tfmdp.policy.layers.action_layer.ActionLayer(action_size: int)

Bases: tensorflow.python.layers.base.Layer

ActionLayer should be used as the output layer in a deep reactive policy (DRP).

It generates multi-head dense output layers with the same shape as the action fluents. Optionally, it restricts the output tensors based on the action bounds.

Parameters:	action_size (Sequence[Sequence[int]]) – The list of action fluent sizes.

_get_output_tensor(tensor: tensorflow.python.framework.ops.Tensor, bounds: Tuple[Optional[tensorflow.python.framework.ops.Tensor], Optional[tensorflow.python.framework.ops.Tensor]]) → tensorflow.python.framework.ops.Tensor

Returns the value-constrained output tensor.

Parameters:	tensor (tf.Tensor) – The layer’s output tensor corresponding to an action fluent.
	bounds (Tuple[Optional[tf.Tensor], Optional[tf.Tensor]]) – The lower and upper bounds of the action fluent; either may be None.
Returns:	The constrained output tensor.
Return type:	tf.Tensor
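
The page does not say how the bounds are enforced. As a rough illustration only, the sketch below shows one common way to constrain an unbounded output with NumPy: a rescaled sigmoid when both bounds are present and an exponential offset when only one is. The function name `constrain_output` and the choice of squashing functions are assumptions, not the library's implementation.

```python
import numpy as np


def constrain_output(tensor, bounds):
    """Map an unconstrained array into the range given by (lower, upper).

    Either bound may be None, meaning "unbounded on that side".
    This is a hypothetical helper, not part of tfmdp.
    """
    lower, upper = bounds
    if lower is not None and upper is not None:
        # Two-sided: the sigmoid maps to (0, 1), then rescale to (lower, upper).
        return lower + (upper - lower) / (1.0 + np.exp(-tensor))
    if lower is not None:
        # Lower bound only: exp is strictly positive, so the result exceeds lower.
        return lower + np.exp(tensor)
    if upper is not None:
        # Upper bound only: subtract a strictly positive quantity from upper.
        return upper - np.exp(tensor)
    # No bounds: pass the values through unchanged.
    return tensor


raw = np.array([-2.0, 0.0, 3.0])
both = constrain_output(raw, (0.0, 10.0))
```

With these assumptions, every entry of `both` lies strictly between 0 and 10, and an input of 0 lands at the midpoint of the range.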

call(inputs: tensorflow.python.framework.ops.Tensor, action_bounds: Optional[Sequence[Tuple[Optional[tensorflow.python.framework.ops.Tensor], Optional[tensorflow.python.framework.ops.Tensor]]]] = None) → Sequence[tensorflow.python.framework.ops.Tensor]

Returns the tensors of the multi-head layer’s output.

Parameters:	inputs (tf.Tensor) – A hidden layer’s output.
	action_bounds (Optional[Sequence[Tuple[Optional[tf.Tensor], Optional[tf.Tensor]]]]) – The action bounds.
Returns:	A tuple of action tensors.
Return type:	Sequence[tf.Tensor]
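
To make the "multi-head" idea concrete, here is a minimal NumPy sketch of one dense head per action fluent, each reshaped to that fluent's shape. It is not the tfmdp code: `multi_head_output`, the random weight initialisation, and the example shapes are all assumptions chosen for illustration.

```python
import numpy as np


def multi_head_output(hidden, action_shapes, rng):
    """Apply one dense head per action fluent and reshape to the fluent's shape.

    hidden: array of shape (batch, units), e.g. a hidden layer's output.
    action_shapes: one shape tuple per action fluent, e.g. [(3,), (2, 2)].
    This is a hypothetical helper, not part of tfmdp.
    """
    batch, units = hidden.shape
    outputs = []
    for shape in action_shapes:
        size = int(np.prod(shape))
        # Illustrative random weights; a real layer would learn these.
        weights = rng.normal(scale=0.1, size=(units, size))
        bias = np.zeros(size)
        head = hidden @ weights + bias
        # Restore the fluent's own shape behind the batch dimension.
        outputs.append(head.reshape((batch,) + tuple(shape)))
    return tuple(outputs)


rng = np.random.default_rng(0)
hidden = rng.normal(size=(4, 8))
actions = multi_head_output(hidden, [(3,), (2, 2)], rng)
```

Here `actions` is a tuple with one array per action fluent, shaped `(4, 3)` and `(4, 2, 2)` for a batch of 4.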

trainable_variables: Returns the list of all layer variables/weights.

tfmdp.policy.layers.state_layer module

class tfmdp.policy.layers.state_layer.StateLayer(input_layer_norm: bool = False)

Bases: tensorflow.python.layers.base.Layer

StateLayer should be used as an input layer in a DRP.

It flattens each state fluent and returns a single concatenated tensor.

Parameters:	input_layer_norm (bool) – The boolean flag for enabling layer normalization.

call(inputs: Sequence[tensorflow.python.framework.ops.Tensor]) → tensorflow.python.framework.ops.Tensor

Returns the concatenation of all state fluent tensors, after each has been flattened.

Parameters:	inputs (Sequence[tf.Tensor]) – A tuple of state fluent tensors.
Returns:	A single output tensor.
Return type:	tf.Tensor
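
The flatten-then-concatenate step can be sketched in NumPy as follows. This is an illustration under assumptions, not the library code: the function name `flatten_and_concat` is hypothetical, the normalization is applied after concatenation, and it omits the learnable scale and offset that a trained layer normalization would normally carry.

```python
import numpy as np


def flatten_and_concat(state_fluents, layer_norm=False, eps=1e-6):
    """Flatten each state fluent to (batch, -1) and concatenate along axis 1.

    state_fluents: sequence of arrays that share the same leading batch size.
    This is a hypothetical helper, not part of tfmdp.
    """
    batch = state_fluents[0].shape[0]
    flat = [np.reshape(fluent, (batch, -1)) for fluent in state_fluents]
    out = np.concatenate(flat, axis=1)
    if layer_norm:
        # Normalize each example across its features (no learnable parameters).
        mean = out.mean(axis=1, keepdims=True)
        std = out.std(axis=1, keepdims=True)
        out = (out - mean) / (std + eps)
    return out


fluents = (np.ones((4, 3)), np.zeros((4, 2, 2)), np.full((4,), 2.0))
state = flatten_and_concat(fluents)
normalized = flatten_and_concat(fluents, layer_norm=True)
```

The three fluents contribute 3, 4, and 1 features respectively, so `state` has shape `(4, 8)`.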

trainable_variables: Returns the list of all layer variables/weights.
