examples/rllib/self_play_train.py error message

(base) nell@Jeremiah norm-games % python3 examples/rllib/self_play_train.py
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:18: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  DESCRIPTOR = _descriptor.FileDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:36: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _descriptor.FieldDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:29: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _TENSORSHAPEPROTO_DIM = _descriptor.Descriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:19: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  DESCRIPTOR = _descriptor.FileDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:33: DeprecationWarning: Call to deprecated create function EnumValueDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _descriptor.EnumValueDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:27: DeprecationWarning: Call to deprecated create function EnumDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _DATATYPE = _descriptor.EnumDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:287: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _descriptor.FieldDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:280: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _SERIALIZEDDTYPE = _descriptor.Descriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:20: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  DESCRIPTOR = _descriptor.FileDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:39: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _descriptor.FieldDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:32: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _RESOURCEHANDLEPROTO_DTYPEANDSHAPE = _descriptor.Descriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:21: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  DESCRIPTOR = _descriptor.FileDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:40: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _descriptor.FieldDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:33: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _TENSORPROTO = _descriptor.Descriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/attr_value_pb2.py:21: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  DESCRIPTOR = _descriptor.FileDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/attr_value_pb2.py:40: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
  _descriptor.FieldDescriptor(
/Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow_probability/python/__init__.py:57: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
  if (distutils.version.LooseVersion(tf.__version__) <
2023-01-30 13:47:50,054	INFO worker.py:1538 -- Started a local Ray instance.
2023-01-30 13:47:52,454	INFO algorithm_config.py:2503 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also then want to set eager_tracing=True in order to reach similar execution speed as with static-graph mode.
2023-01-30 13:47:52,455	INFO algorithm_config.py:2503 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also then want to set eager_tracing=True in order to reach similar execution speed as with static-graph mode.
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:18: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:36: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _descriptor.FieldDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:29: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _TENSORSHAPEPROTO_DIM = _descriptor.Descriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:19: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:33: DeprecationWarning: Call to deprecated create function EnumValueDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _descriptor.EnumValueDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:27: DeprecationWarning: Call to deprecated create function EnumDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _DATATYPE = _descriptor.EnumDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:287: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _descriptor.FieldDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:280: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _SERIALIZEDDTYPE = _descriptor.Descriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:20: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:39: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _descriptor.FieldDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:32: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _RESOURCEHANDLEPROTO_DTYPEANDSHAPE = _descriptor.Descriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:21: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:40: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _descriptor.FieldDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:33: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _TENSORPROTO = _descriptor.Descriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/attr_value_pb2.py:21: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/attr_value_pb2.py:40: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46962)   _descriptor.FieldDescriptor(
(pid=46962) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow_probability/python/__init__.py:57: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
(pid=46962)   if (distutils.version.LooseVersion(tf.__version__) <
(PPO pid=46962) 2023-01-30 13:47:56,215	WARNING algorithm_config.py:488 -- Cannot create PPOConfig from given `config_dict`! Property __stdout_file__ not supported.
(PPO pid=46962) 2023-01-30 13:47:56,216	INFO algorithm_config.py:2503 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also then want to set eager_tracing=True in order to reach similar execution speed as with static-graph mode.
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:18: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:36: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _descriptor.FieldDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:29: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _TENSORSHAPEPROTO_DIM = _descriptor.Descriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:19: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:33: DeprecationWarning: Call to deprecated create function EnumValueDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _descriptor.EnumValueDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:27: DeprecationWarning: Call to deprecated create function EnumDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _DATATYPE = _descriptor.EnumDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:287: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _descriptor.FieldDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:280: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _SERIALIZEDDTYPE = _descriptor.Descriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:20: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:39: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _descriptor.FieldDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:32: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _RESOURCEHANDLEPROTO_DTYPEANDSHAPE = _descriptor.Descriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:21: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:40: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _descriptor.FieldDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:33: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _TENSORPROTO = _descriptor.Descriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/attr_value_pb2.py:21: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/attr_value_pb2.py:40: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46974)   _descriptor.FieldDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:18: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:36: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _descriptor.FieldDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_shape_pb2.py:29: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _TENSORSHAPEPROTO_DIM = _descriptor.Descriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:19: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:33: DeprecationWarning: Call to deprecated create function EnumValueDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _descriptor.EnumValueDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:27: DeprecationWarning: Call to deprecated create function EnumDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _DATATYPE = _descriptor.EnumDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:287: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _descriptor.FieldDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/types_pb2.py:280: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _SERIALIZEDDTYPE = _descriptor.Descriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:20: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:39: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _descriptor.FieldDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/resource_handle_pb2.py:32: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _RESOURCEHANDLEPROTO_DTYPEANDSHAPE = _descriptor.Descriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:21: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:40: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _descriptor.FieldDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/tensor_pb2.py:33: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _TENSORPROTO = _descriptor.Descriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/attr_value_pb2.py:21: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   DESCRIPTOR = _descriptor.FileDescriptor(
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow/core/framework/attr_value_pb2.py:40: DeprecationWarning: Call to deprecated create function FieldDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
(pid=46975)   _descriptor.FieldDescriptor(
(pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow_probability/python/__init__.py:57: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
(pid=46974)   if (distutils.version.LooseVersion(tf.__version__) <
(pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/tensorflow_probability/python/__init__.py:57: DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
(pid=46975)   if (distutils.version.LooseVersion(tf.__version__) <
(RolloutWorker pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/gym/spaces/box.py:155: UserWarning: WARN: Casting input x to numpy array.
(RolloutWorker pid=46974)   logger.warn("Casting input x to numpy array.")
(RolloutWorker pid=46974) /Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/utils/pre_checks/env.py:434: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
(RolloutWorker pid=46974) Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
(RolloutWorker pid=46974)   if not isinstance(done_, (bool, np.bool, np.bool_)):
(RolloutWorker pid=46974) 2023-01-30 13:48:00,643	DEBUG rollout_worker.py:1932 -- Creating policy for agent_0
(RolloutWorker pid=46974) 2023-01-30 13:48:00,645	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/gym/spaces/box.py:155: UserWarning: WARN: Casting input x to numpy array.
(RolloutWorker pid=46975)   logger.warn("Casting input x to numpy array.")
(RolloutWorker pid=46975) /Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/utils/pre_checks/env.py:434: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
(RolloutWorker pid=46975) Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
(RolloutWorker pid=46975)   if not isinstance(done_, (bool, np.bool, np.bool_)):
(RolloutWorker pid=46975) 2023-01-30 13:48:00,682	DEBUG rollout_worker.py:1932 -- Creating policy for agent_0
(RolloutWorker pid=46975) 2023-01-30 13:48:00,684	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46974) 2023-01-30 13:48:00,956	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:00,957	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:00,957	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46974) 2023-01-30 13:48:00,957	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46975) 2023-01-30 13:48:00,993	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:00,993	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:00,993	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46975) 2023-01-30 13:48:00,993	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46974) Model: "model_5"
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46974)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_0_wk1/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_0_wk1/Sequen  [()]                0           ['tf_op_layer_agent_0_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46974)  yer)                                                                                             
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_0_wk1/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46974)  pLayer)                                                                                          
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_0_wk1/Sequen  [(None,)]           0           ['tf_op_layer_agent_0_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46974)  r)                                                                                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_0_wk1/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_0_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46974)  )                                                                                                
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_0_wk1/Sequen  [(None, None)]      0           ['tf_op_layer_agent_0_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46974)  )                                                                'tf_op_layer_agent_0_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46974)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46974)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46974)                                                                   'tf_op_layer_agent_0_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974) Total params: 535,817
(RolloutWorker pid=46974) Trainable params: 535,817
(RolloutWorker pid=46974) Non-trainable params: 0
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46975) Model: "model_5"
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46975)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_0_wk2/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_0_wk2/Sequen  [()]                0           ['tf_op_layer_agent_0_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46975)  yer)                                                                                             
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_0_wk2/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46975)  pLayer)                                                                                          
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_0_wk2/Sequen  [(None,)]           0           ['tf_op_layer_agent_0_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46975)  r)                                                                                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_0_wk2/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_0_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46975)  )                                                                                                
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_0_wk2/Sequen  [(None, None)]      0           ['tf_op_layer_agent_0_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46975)  )                                                                'tf_op_layer_agent_0_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46975)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46975)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46975)                                                                   'tf_op_layer_agent_0_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975) Total params: 535,817
(RolloutWorker pid=46975) Trainable params: 535,817
(RolloutWorker pid=46975) Non-trainable params: 0
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46974) 2023-01-30 13:48:01,060	INFO policy.py:1147 -- Policy (worker=1) running on CPU.
(RolloutWorker pid=46974) 2023-01-30 13:48:01,060	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46975) 2023-01-30 13:48:01,087	INFO policy.py:1147 -- Policy (worker=2) running on CPU.
(RolloutWorker pid=46975) 2023-01-30 13:48:01,087	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46974) 2023-01-30 13:48:01,126	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:01,126	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:01,127	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:01,127	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:01,127	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46975) 2023-01-30 13:48:01,152	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:01,152	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:01,153	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:01,153	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:01,153	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46974) 2023-01-30 13:48:01,344	DEBUG dynamic_tf_policy_v2.py:755 -- Initializing loss function with dummy input:
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974) { 'action_dist_inputs': <tf.Tensor 'agent_0_wk1/action_dist_inputs:0' shape=(?, 8) dtype=float32>,
(RolloutWorker pid=46974)   'action_logp': <tf.Tensor 'agent_0_wk1/action_logp:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'action_prob': <tf.Tensor 'agent_0_wk1/action_prob:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'actions': <tf.Tensor 'agent_0_wk1/actions:0' shape=(?,) dtype=int64>,
(RolloutWorker pid=46974)   'advantages': <tf.Tensor 'agent_0_wk1/advantages:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'agent_index': <tf.Tensor 'agent_0_wk1/agent_index:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'dones': <tf.Tensor 'agent_0_wk1/dones:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'eps_id': <tf.Tensor 'agent_0_wk1/eps_id:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'new_obs': <tf.Tensor 'agent_0_wk1/Placeholder_1:0' shape=(?, 23236) dtype=float32>,
(RolloutWorker pid=46974)   'obs': <tf.Tensor 'agent_0_wk1/Placeholder:0' shape=(?, 23236) dtype=float32>,
(RolloutWorker pid=46974)   'prev_actions': <tf.Tensor 'agent_0_wk1/prev_actions:0' shape=(?,) dtype=int64>,
(RolloutWorker pid=46974)   'prev_rewards': <tf.Tensor 'agent_0_wk1/prev_rewards:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'rewards': <tf.Tensor 'agent_0_wk1/rewards:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'seq_lens': <tf.Tensor 'agent_0_wk1/seq_lens:0' shape=(?,) dtype=int32>,
(RolloutWorker pid=46974)   'state_in_0': <tf.Tensor 'agent_0_wk1/state_in_0:0' shape=(?, 256) dtype=float32>,
(RolloutWorker pid=46974)   'state_in_1': <tf.Tensor 'agent_0_wk1/state_in_1:0' shape=(?, 256) dtype=float32>,
(RolloutWorker pid=46974)   't': <tf.Tensor 'agent_0_wk1/t:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'unroll_id': <tf.Tensor 'agent_0_wk1/unroll_id:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'value_targets': <tf.Tensor 'agent_0_wk1/value_targets:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'vf_preds': <tf.Tensor 'agent_0_wk1/vf_preds:0' shape=(?,) dtype=float32>}
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974) 2023-01-30 13:48:01,816	DEBUG tf_policy.py:783 -- These tensors were used in the loss functions:
(RolloutWorker pid=46974) { 'action_dist_inputs': <tf.Tensor 'agent_0_wk1/action_dist_inputs:0' shape=(?, 8) dtype=float32>,
(RolloutWorker pid=46974)   'action_logp': <tf.Tensor 'agent_0_wk1/action_logp:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'action_prob': <tf.Tensor 'agent_0_wk1/action_prob:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'actions': <tf.Tensor 'agent_0_wk1/actions:0' shape=(?,) dtype=int64>,
(RolloutWorker pid=46974)   'advantages': <tf.Tensor 'agent_0_wk1/advantages:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'dones': <tf.Tensor 'agent_0_wk1/dones:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'new_obs': <tf.Tensor 'agent_0_wk1/Placeholder_1:0' shape=(?, 23236) dtype=float32>,
(RolloutWorker pid=46974)   'obs': <tf.Tensor 'agent_0_wk1/Placeholder:0' shape=(?, 23236) dtype=float32>,
(RolloutWorker pid=46974)   'prev_actions': <tf.Tensor 'agent_0_wk1/prev_actions:0' shape=(?,) dtype=int64>,
(RolloutWorker pid=46974)   'rewards': <tf.Tensor 'agent_0_wk1/rewards:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'seq_lens': <tf.Tensor 'agent_0_wk1/seq_lens:0' shape=(?,) dtype=int32>,
(RolloutWorker pid=46974)   'state_in_0': <tf.Tensor 'agent_0_wk1/state_in_0:0' shape=(?, 256) dtype=float32>,
(RolloutWorker pid=46974)   'state_in_1': <tf.Tensor 'agent_0_wk1/state_in_1:0' shape=(?, 256) dtype=float32>,
(RolloutWorker pid=46974)   'value_targets': <tf.Tensor 'agent_0_wk1/value_targets:0' shape=(?,) dtype=float32>,
(RolloutWorker pid=46974)   'vf_preds': <tf.Tensor 'agent_0_wk1/vf_preds:0' shape=(?,) dtype=float32>}
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974) 2023-01-30 13:48:02,076	DEBUG rollout_worker.py:1932 -- Creating policy for agent_1
(RolloutWorker pid=46974) 2023-01-30 13:48:02,077	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46975) 2023-01-30 13:48:02,109	DEBUG rollout_worker.py:1932 -- Creating policy for agent_1
(RolloutWorker pid=46975) 2023-01-30 13:48:02,111	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46974) 2023-01-30 13:48:02,361	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:02,361	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:02,362	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46974) 2023-01-30 13:48:02,362	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46974) Model: "model_5"
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46974)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_1_wk1/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_1_wk1/Sequen  [()]                0           ['tf_op_layer_agent_1_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46974)  yer)                                                                                             
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_1_wk1/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46974)  pLayer)                                                                                          
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_1_wk1/Sequen  [(None,)]           0           ['tf_op_layer_agent_1_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46974)  r)                                                                                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_1_wk1/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_1_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46974)  )                                                                                                
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_1_wk1/Sequen  [(None, None)]      0           ['tf_op_layer_agent_1_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46974)  )                                                                'tf_op_layer_agent_1_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46974)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46974)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46974)                                                                   'tf_op_layer_agent_1_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974) Total params: 535,817
(RolloutWorker pid=46974) Trainable params: 535,817
(RolloutWorker pid=46974) Non-trainable params: 0
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46974) 2023-01-30 13:48:02,448	INFO policy.py:1147 -- Policy (worker=1) running on CPU.
(RolloutWorker pid=46974) 2023-01-30 13:48:02,448	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46975) 2023-01-30 13:48:02,385	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:02,385	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:02,385	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46975) 2023-01-30 13:48:02,385	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46975) 2023-01-30 13:48:02,469	INFO policy.py:1147 -- Policy (worker=2) running on CPU.
(RolloutWorker pid=46975) 2023-01-30 13:48:02,469	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46975) Model: "model_5"
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46975)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_1_wk2/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_1_wk2/Sequen  [()]                0           ['tf_op_layer_agent_1_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46975)  yer)                                                                                             
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_1_wk2/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46975)  pLayer)                                                                                          
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_1_wk2/Sequen  [(None,)]           0           ['tf_op_layer_agent_1_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46975)  r)                                                                                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_1_wk2/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_1_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46975)  )                                                                                                
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_1_wk2/Sequen  [(None, None)]      0           ['tf_op_layer_agent_1_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46975)  )                                                                'tf_op_layer_agent_1_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46975)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46975)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46975)                                                                   'tf_op_layer_agent_1_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975) Total params: 535,817
(RolloutWorker pid=46975) Trainable params: 535,817
(RolloutWorker pid=46975) Non-trainable params: 0
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46974) 2023-01-30 13:48:02,511	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:02,512	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:02,512	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:02,513	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:02,513	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46975) 2023-01-30 13:48:02,533	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:02,533	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:02,533	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:02,533	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:02,534	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46974) 2023-01-30 13:48:03,339	DEBUG rollout_worker.py:1932 -- Creating policy for agent_2
(RolloutWorker pid=46974) 2023-01-30 13:48:03,340	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46975) 2023-01-30 13:48:03,358	DEBUG rollout_worker.py:1932 -- Creating policy for agent_2
(RolloutWorker pid=46975) 2023-01-30 13:48:03,359	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46974) 2023-01-30 13:48:03,632	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:03,632	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:03,632	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46974) 2023-01-30 13:48:03,632	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46974) 2023-01-30 13:48:03,716	INFO policy.py:1147 -- Policy (worker=1) running on CPU.
(RolloutWorker pid=46974) 2023-01-30 13:48:03,717	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46975) 2023-01-30 13:48:03,634	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:03,634	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:03,634	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46975) 2023-01-30 13:48:03,634	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46975) 2023-01-30 13:48:03,718	INFO policy.py:1147 -- Policy (worker=2) running on CPU.
(RolloutWorker pid=46975) 2023-01-30 13:48:03,718	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46974) Model: "model_5"
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46974)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_2_wk1/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_2_wk1/Sequen  [()]                0           ['tf_op_layer_agent_2_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46974)  yer)                                                                                             
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_2_wk1/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46974)  pLayer)                                                                                          
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_2_wk1/Sequen  [(None,)]           0           ['tf_op_layer_agent_2_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46974)  r)                                                                                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_2_wk1/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_2_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46974)  )                                                                                                
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_2_wk1/Sequen  [(None, None)]      0           ['tf_op_layer_agent_2_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46974)  )                                                                'tf_op_layer_agent_2_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46974)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46974)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46974)                                                                   'tf_op_layer_agent_2_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974) Total params: 535,817
(RolloutWorker pid=46974) Trainable params: 535,817
(RolloutWorker pid=46974) Non-trainable params: 0
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46975) Model: "model_5"
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46975)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_2_wk2/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_2_wk2/Sequen  [()]                0           ['tf_op_layer_agent_2_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46975)  yer)                                                                                             
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_2_wk2/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46975)  pLayer)                                                                                          
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_2_wk2/Sequen  [(None,)]           0           ['tf_op_layer_agent_2_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46975)  r)                                                                                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_2_wk2/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_2_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46975)  )                                                                                                
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_2_wk2/Sequen  [(None, None)]      0           ['tf_op_layer_agent_2_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46975)  )                                                                'tf_op_layer_agent_2_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46975)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46975)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46975)                                                                   'tf_op_layer_agent_2_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975) Total params: 535,817
(RolloutWorker pid=46975) Trainable params: 535,817
(RolloutWorker pid=46975) Non-trainable params: 0
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46974) 2023-01-30 13:48:03,782	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:03,782	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:03,783	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:03,783	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:03,783	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46975) 2023-01-30 13:48:03,778	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:03,781	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:03,781	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:03,782	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:03,782	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46974) 2023-01-30 13:48:04,609	DEBUG rollout_worker.py:1932 -- Creating policy for agent_3
(RolloutWorker pid=46974) 2023-01-30 13:48:04,611	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46975) 2023-01-30 13:48:04,612	DEBUG rollout_worker.py:1932 -- Creating policy for agent_3
(RolloutWorker pid=46975) 2023-01-30 13:48:04,614	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46974) 2023-01-30 13:48:04,890	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:04,890	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:04,890	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46974) 2023-01-30 13:48:04,890	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46975) 2023-01-30 13:48:04,891	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:04,891	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:04,891	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46975) 2023-01-30 13:48:04,891	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46974) Model: "model_5"
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46974)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_3_wk1/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_3_wk1/Sequen  [()]                0           ['tf_op_layer_agent_3_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46974)  yer)                                                                                             
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_3_wk1/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46974)  pLayer)                                                                                          
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_3_wk1/Sequen  [(None,)]           0           ['tf_op_layer_agent_3_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46974)  r)                                                                                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_3_wk1/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_3_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46974)  )                                                                                                
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_3_wk1/Sequen  [(None, None)]      0           ['tf_op_layer_agent_3_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46974)  )                                                                'tf_op_layer_agent_3_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46974)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46974)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46974)                                                                   'tf_op_layer_agent_3_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974) Total params: 535,817
(RolloutWorker pid=46974) Trainable params: 535,817
(RolloutWorker pid=46974) Non-trainable params: 0
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46975) Model: "model_5"
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46975)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_3_wk2/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_3_wk2/Sequen  [()]                0           ['tf_op_layer_agent_3_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46975)  yer)                                                                                             
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_3_wk2/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46975)  pLayer)                                                                                          
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_3_wk2/Sequen  [(None,)]           0           ['tf_op_layer_agent_3_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46975)  r)                                                                                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_3_wk2/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_3_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46975)  )                                                                                                
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_3_wk2/Sequen  [(None, None)]      0           ['tf_op_layer_agent_3_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46975)  )                                                                'tf_op_layer_agent_3_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46975)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46975)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46975)                                                                   'tf_op_layer_agent_3_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975) Total params: 535,817
(RolloutWorker pid=46975) Trainable params: 535,817
(RolloutWorker pid=46975) Non-trainable params: 0
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46974) 2023-01-30 13:48:04,973	INFO policy.py:1147 -- Policy (worker=1) running on CPU.
(RolloutWorker pid=46974) 2023-01-30 13:48:04,973	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46974) 2023-01-30 13:48:05,035	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:05,035	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:05,035	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:05,035	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:05,036	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46975) 2023-01-30 13:48:04,975	INFO policy.py:1147 -- Policy (worker=2) running on CPU.
(RolloutWorker pid=46975) 2023-01-30 13:48:04,976	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46975) 2023-01-30 13:48:05,038	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:05,039	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:05,039	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:05,039	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:05,039	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46974) 2023-01-30 13:48:05,862	DEBUG rollout_worker.py:1932 -- Creating policy for agent_4
(RolloutWorker pid=46974) 2023-01-30 13:48:05,863	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46975) 2023-01-30 13:48:05,863	DEBUG rollout_worker.py:1932 -- Creating policy for agent_4
(RolloutWorker pid=46975) 2023-01-30 13:48:05,864	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46974) 2023-01-30 13:48:06,276	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:06,277	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:06,277	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46974) 2023-01-30 13:48:06,277	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46975) 2023-01-30 13:48:06,274	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:06,274	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:06,274	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46975) 2023-01-30 13:48:06,274	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46974) Model: "model_5"
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46974)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_4_wk1/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_4_wk1/Sequen  [()]                0           ['tf_op_layer_agent_4_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46974)  yer)                                                                                             
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_4_wk1/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46974)  pLayer)                                                                                          
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_4_wk1/Sequen  [(None,)]           0           ['tf_op_layer_agent_4_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46974)  r)                                                                                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_4_wk1/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_4_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46974)  )                                                                                                
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_4_wk1/Sequen  [(None, None)]      0           ['tf_op_layer_agent_4_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46974)  )                                                                'tf_op_layer_agent_4_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46974)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46974)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46974)                                                                   'tf_op_layer_agent_4_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974) Total params: 535,817
(RolloutWorker pid=46974) Trainable params: 535,817
(RolloutWorker pid=46974) Non-trainable params: 0
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46975) Model: "model_5"
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46975)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_4_wk2/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_4_wk2/Sequen  [()]                0           ['tf_op_layer_agent_4_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46975)  yer)                                                                                             
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_4_wk2/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46975)  pLayer)                                                                                          
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_4_wk2/Sequen  [(None,)]           0           ['tf_op_layer_agent_4_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46975)  r)                                                                                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_4_wk2/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_4_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46975)  )                                                                                                
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_4_wk2/Sequen  [(None, None)]      0           ['tf_op_layer_agent_4_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46975)  )                                                                'tf_op_layer_agent_4_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46975)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46975)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46975)                                                                   'tf_op_layer_agent_4_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975) Total params: 535,817
(RolloutWorker pid=46975) Trainable params: 535,817
(RolloutWorker pid=46975) Non-trainable params: 0
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46974) 2023-01-30 13:48:06,361	INFO policy.py:1147 -- Policy (worker=1) running on CPU.
(RolloutWorker pid=46974) 2023-01-30 13:48:06,361	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46974) 2023-01-30 13:48:06,423	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:06,424	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:06,424	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:06,424	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:06,424	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46975) 2023-01-30 13:48:06,359	INFO policy.py:1147 -- Policy (worker=2) running on CPU.
(RolloutWorker pid=46975) 2023-01-30 13:48:06,359	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46975) 2023-01-30 13:48:06,423	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:06,423	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:06,424	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:06,424	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:06,424	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46974) 2023-01-30 13:48:07,247	DEBUG rollout_worker.py:1932 -- Creating policy for agent_5
(RolloutWorker pid=46974) 2023-01-30 13:48:07,248	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46975) 2023-01-30 13:48:07,253	DEBUG rollout_worker.py:1932 -- Creating policy for agent_5
(RolloutWorker pid=46975) 2023-01-30 13:48:07,255	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46974) 2023-01-30 13:48:07,529	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:07,529	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:07,529	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46974) 2023-01-30 13:48:07,529	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46975) 2023-01-30 13:48:07,532	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:07,532	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:07,532	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46975) 2023-01-30 13:48:07,532	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46974) Model: "model_5"
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46974)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_5_wk1/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_5_wk1/Sequen  [()]                0           ['tf_op_layer_agent_5_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46974)  yer)                                                                                             
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_5_wk1/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46974)  pLayer)                                                                                          
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_5_wk1/Sequen  [(None,)]           0           ['tf_op_layer_agent_5_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46974)  r)                                                                                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_5_wk1/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_5_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46974)  )                                                                                                
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_5_wk1/Sequen  [(None, None)]      0           ['tf_op_layer_agent_5_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46974)  )                                                                'tf_op_layer_agent_5_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46974)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46974)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46974)                                                                   'tf_op_layer_agent_5_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974) Total params: 535,817
(RolloutWorker pid=46974) Trainable params: 535,817
(RolloutWorker pid=46974) Non-trainable params: 0
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46975) Model: "model_5"
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46975)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_5_wk2/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_5_wk2/Sequen  [()]                0           ['tf_op_layer_agent_5_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46975)  yer)                                                                                             
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_5_wk2/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46975)  pLayer)                                                                                          
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_5_wk2/Sequen  [(None,)]           0           ['tf_op_layer_agent_5_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46975)  r)                                                                                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_5_wk2/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_5_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46975)  )                                                                                                
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_5_wk2/Sequen  [(None, None)]      0           ['tf_op_layer_agent_5_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46975)  )                                                                'tf_op_layer_agent_5_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46975)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46975)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46975)                                                                   'tf_op_layer_agent_5_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975) Total params: 535,817
(RolloutWorker pid=46975) Trainable params: 535,817
(RolloutWorker pid=46975) Non-trainable params: 0
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46974) 2023-01-30 13:48:07,614	INFO policy.py:1147 -- Policy (worker=1) running on CPU.
(RolloutWorker pid=46974) 2023-01-30 13:48:07,614	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46974) 2023-01-30 13:48:07,675	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:07,676	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:07,676	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:07,676	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:07,676	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46975) 2023-01-30 13:48:07,616	INFO policy.py:1147 -- Policy (worker=2) running on CPU.
(RolloutWorker pid=46975) 2023-01-30 13:48:07,616	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46975) 2023-01-30 13:48:07,676	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:07,676	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:07,676	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:07,677	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:07,677	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46974) 2023-01-30 13:48:08,505	DEBUG rollout_worker.py:1932 -- Creating policy for agent_6
(RolloutWorker pid=46974) 2023-01-30 13:48:08,507	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46975) 2023-01-30 13:48:08,498	DEBUG rollout_worker.py:1932 -- Creating policy for agent_6
(RolloutWorker pid=46975) 2023-01-30 13:48:08,500	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 2, 'inter_op_parallelism_threads': 2, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46974) 2023-01-30 13:48:08,784	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:08,785	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46974) 2023-01-30 13:48:08,785	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46974) 2023-01-30 13:48:08,785	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46975) 2023-01-30 13:48:08,784	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:08,784	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(RolloutWorker pid=46975) 2023-01-30 13:48:08,784	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(RolloutWorker pid=46975) 2023-01-30 13:48:08,784	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(RolloutWorker pid=46974) Model: "model_5"
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46974)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_6_wk1/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_6_wk1/Sequen  [()]                0           ['tf_op_layer_agent_6_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46974)  yer)                                                                                             
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_6_wk1/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46974)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46974)  pLayer)                                                                                          
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_6_wk1/Sequen  [(None,)]           0           ['tf_op_layer_agent_6_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46974)  r)                                                                                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_6_wk1/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_6_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46974)  )                                                                                                
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  tf_op_layer_agent_6_wk1/Sequen  [(None, None)]      0           ['tf_op_layer_agent_6_wk1/Sequenc
(RolloutWorker pid=46974)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46974)  )                                                                'tf_op_layer_agent_6_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46974)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46974)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46974)                                                                   'tf_op_layer_agent_6_wk1/Sequenc
(RolloutWorker pid=46974)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46974)                                                                                                   
(RolloutWorker pid=46974) ==================================================================================================
(RolloutWorker pid=46974) Total params: 535,817
(RolloutWorker pid=46974) Trainable params: 535,817
(RolloutWorker pid=46974) Non-trainable params: 0
(RolloutWorker pid=46974) __________________________________________________________________________________________________
(RolloutWorker pid=46975) Model: "model_5"
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46975)  Layer (type)                   Output Shape         Param #     Connected to                     
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975)  seq_in (InputLayer)            [(None,)]            0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_6_wk2/Sequen  [()]                0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/Max (TensorFlowOpLayer)                                                                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_6_wk2/Sequen  [()]                0           ['tf_op_layer_agent_6_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Maximum (TensorFlowOpLa                                  eMask/Max[0][0]']                
(RolloutWorker pid=46975)  yer)                                                                                             
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_6_wk2/Sequen  [(None, 1)]         0           ['seq_in[0][0]']                 
(RolloutWorker pid=46975)  ceMask/ExpandDims (TensorFlowO                                                                   
(RolloutWorker pid=46975)  pLayer)                                                                                          
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_6_wk2/Sequen  [(None,)]           0           ['tf_op_layer_agent_6_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Range (TensorFlowOpLaye                                  eMask/Maximum[0][0]']            
(RolloutWorker pid=46975)  r)                                                                                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_6_wk2/Sequen  [(None, 1)]         0           ['tf_op_layer_agent_6_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Cast (TensorFlowOpLayer                                  eMask/ExpandDims[0][0]']         
(RolloutWorker pid=46975)  )                                                                                                
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  h (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  c (InputLayer)                 [(None, 256)]        0           []                               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  tf_op_layer_agent_6_wk2/Sequen  [(None, None)]      0           ['tf_op_layer_agent_6_wk2/Sequenc
(RolloutWorker pid=46975)  ceMask/Less (TensorFlowOpLayer                                  eMask/Range[0][0]',              
(RolloutWorker pid=46975)  )                                                                'tf_op_layer_agent_6_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Cast[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(RolloutWorker pid=46975)                                  (None, 256),                     'h[0][0]',                      
(RolloutWorker pid=46975)                                  (None, 256)]                     'c[0][0]',                      
(RolloutWorker pid=46975)                                                                   'tf_op_layer_agent_6_wk2/Sequenc
(RolloutWorker pid=46975)                                                                  eMask/Less[0][0]']               
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(RolloutWorker pid=46975)                                                                                                   
(RolloutWorker pid=46975) ==================================================================================================
(RolloutWorker pid=46975) Total params: 535,817
(RolloutWorker pid=46975) Trainable params: 535,817
(RolloutWorker pid=46975) Non-trainable params: 0
(RolloutWorker pid=46975) __________________________________________________________________________________________________
(RolloutWorker pid=46974) 2023-01-30 13:48:08,869	INFO policy.py:1147 -- Policy (worker=1) running on CPU.
(RolloutWorker pid=46974) 2023-01-30 13:48:08,870	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46974) 2023-01-30 13:48:08,931	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:08,931	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:08,932	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:08,932	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46974) 2023-01-30 13:48:08,932	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(RolloutWorker pid=46975) 2023-01-30 13:48:08,868	INFO policy.py:1147 -- Policy (worker=2) running on CPU.
(RolloutWorker pid=46975) 2023-01-30 13:48:08,868	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(RolloutWorker pid=46975) 2023-01-30 13:48:08,929	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:08,929	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:08,929	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:08,930	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(RolloutWorker pid=46975) 2023-01-30 13:48:08,930	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(PPO pid=46962) 2023-01-30 13:48:09,770	INFO worker_set.py:309 -- Inferred observation/action spaces from remote worker (local worker has no env): {'agent_3': (Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), Discrete(8)), 'agent_0': (Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), Discrete(8)), 'agent_4': (Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), Discrete(8)), 'agent_2': (Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), Discrete(8)), 'agent_5': (Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), Discrete(8)), 'agent_6': (Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), Discrete(8)), 'agent_1': (Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), Discrete(8)), '__env__': (Dict(player_0:Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), player_1:Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), player_2:Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), player_3:Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), player_4:Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), player_5:Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)), player_6:Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8))), Dict(player_0:Discrete(8), player_1:Discrete(8), player_2:Discrete(8), player_3:Discrete(8), player_4:Discrete(8), player_5:Discrete(8), player_6:Discrete(8)))}
(PPO pid=46962) 2023-01-30 13:48:09,776	DEBUG rollout_worker.py:1932 -- Creating policy for agent_0
(PPO pid=46962) 2023-01-30 13:48:09,777	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:09,778	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:09,778	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:09,778	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) 2023-01-30 13:48:09,779	DEBUG catalog.py:813 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x1738c1270>: Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)) -> (23236,)
(PPO pid=46962) 2023-01-30 13:48:09,781	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 8, 'inter_op_parallelism_threads': 8, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(RolloutWorker pid=46974) 2023-01-30 13:48:09,753	DEBUG rollout_worker.py:841 -- Created rollout worker with env <ray.rllib.env.multi_agent_env.MultiAgentEnvWrapper object at 0x31aa5b250> (<MeltingPotEnv instance>), policies {}
(RolloutWorker pid=46975) 2023-01-30 13:48:09,761	DEBUG rollout_worker.py:841 -- Created rollout worker with env <ray.rllib.env.multi_agent_env.MultiAgentEnvWrapper object at 0x31595b040> (<MeltingPotEnv instance>), policies {}
(PPO pid=46962) 2023-01-30 13:48:10,100	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:10,100	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:10,100	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:10,100	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) Model: "model_5"
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962)  Layer (type)                   Output Shape         Param #     Connected to                     
(PPO pid=46962) ==================================================================================================
(PPO pid=46962)  seq_in (InputLayer)            [(None,)]            0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_0/SequenceMa  [()]                0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/Max (TensorFlowOpLayer)                                                                       
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_0/SequenceMa  [()]                0           ['tf_op_layer_agent_0/SequenceMas
(PPO pid=46962)  sk/Maximum (TensorFlowOpLayer)                                  k/Max[0][0]']                    
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_0/SequenceMa  [(None, 1)]         0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/ExpandDims (TensorFlowOpLay                                                                   
(PPO pid=46962)  er)                                                                                              
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_0/SequenceMa  [(None,)]           0           ['tf_op_layer_agent_0/SequenceMas
(PPO pid=46962)  sk/Range (TensorFlowOpLayer)                                    k/Maximum[0][0]']                
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_0/SequenceMa  [(None, 1)]         0           ['tf_op_layer_agent_0/SequenceMas
(PPO pid=46962)  sk/Cast (TensorFlowOpLayer)                                     k/ExpandDims[0][0]']             
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  h (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  c (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_0/SequenceMa  [(None, None)]      0           ['tf_op_layer_agent_0/SequenceMas
(PPO pid=46962)  sk/Less (TensorFlowOpLayer)                                     k/Range[0][0]',                  
(PPO pid=46962)                                                                   'tf_op_layer_agent_0/SequenceMas
(PPO pid=46962)                                                                  k/Cast[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(PPO pid=46962)                                  (None, 256),                     'h[0][0]',                      
(PPO pid=46962)                                  (None, 256)]                     'c[0][0]',                      
(PPO pid=46962)                                                                   'tf_op_layer_agent_0/SequenceMas
(PPO pid=46962)                                                                  k/Less[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962) ==================================================================================================
(PPO pid=46962) Total params: 535,817
(PPO pid=46962) Trainable params: 535,817
(PPO pid=46962) Non-trainable params: 0
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962) 2023-01-30 13:48:10,195	INFO policy.py:1147 -- Policy (worker=local) running on CPU.
(PPO pid=46962) 2023-01-30 13:48:10,195	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(PPO pid=46962) 2023-01-30 13:48:10,374	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:10,375	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:10,375	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:10,375	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:10,375	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(PPO pid=46962) 2023-01-30 13:48:10,486	DEBUG dynamic_tf_policy_v2.py:755 -- Initializing loss function with dummy input:
(PPO pid=46962) 
(PPO pid=46962) { 'action_dist_inputs': <tf.Tensor 'agent_0/action_dist_inputs:0' shape=(?, 8) dtype=float32>,
(PPO pid=46962)   'action_logp': <tf.Tensor 'agent_0/action_logp:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'action_prob': <tf.Tensor 'agent_0/action_prob:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'actions': <tf.Tensor 'agent_0/actions:0' shape=(?,) dtype=int64>,
(PPO pid=46962)   'advantages': <tf.Tensor 'agent_0/advantages:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'agent_index': <tf.Tensor 'agent_0/agent_index:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'dones': <tf.Tensor 'agent_0/dones:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'eps_id': <tf.Tensor 'agent_0/eps_id:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'new_obs': <tf.Tensor 'agent_0/new_obs:0' shape=(?, 23236) dtype=float32>,
(PPO pid=46962)   'obs': <tf.Tensor 'agent_0/Placeholder:0' shape=(?, 23236) dtype=float32>,
(PPO pid=46962)   'prev_actions': <tf.Tensor 'agent_0/prev_actions:0' shape=(?,) dtype=int64>,
(PPO pid=46962)   'prev_rewards': <tf.Tensor 'agent_0/prev_rewards:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'rewards': <tf.Tensor 'agent_0/rewards:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'seq_lens': <tf.Tensor 'agent_0/seq_lens:0' shape=(?,) dtype=int32>,
(PPO pid=46962)   'state_in_0': <tf.Tensor 'agent_0/state_in_0:0' shape=(?, 256) dtype=float32>,
(PPO pid=46962)   'state_in_1': <tf.Tensor 'agent_0/state_in_1:0' shape=(?, 256) dtype=float32>,
(PPO pid=46962)   't': <tf.Tensor 'agent_0/t:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'unroll_id': <tf.Tensor 'agent_0/unroll_id:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'value_targets': <tf.Tensor 'agent_0/value_targets:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'vf_preds': <tf.Tensor 'agent_0/vf_preds:0' shape=(?,) dtype=float32>}
(PPO pid=46962) 
(PPO pid=46962) 2023-01-30 13:48:10,966	DEBUG tf_policy.py:783 -- These tensors were used in the loss functions:
(PPO pid=46962) { 'action_dist_inputs': <tf.Tensor 'agent_0/action_dist_inputs:0' shape=(?, 8) dtype=float32>,
(PPO pid=46962)   'action_logp': <tf.Tensor 'agent_0/action_logp:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'action_prob': <tf.Tensor 'agent_0/action_prob:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'actions': <tf.Tensor 'agent_0/actions:0' shape=(?,) dtype=int64>,
(PPO pid=46962)   'advantages': <tf.Tensor 'agent_0/advantages:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'dones': <tf.Tensor 'agent_0/dones:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'new_obs': <tf.Tensor 'agent_0/new_obs:0' shape=(?, 23236) dtype=float32>,
(PPO pid=46962)   'obs': <tf.Tensor 'agent_0/Placeholder:0' shape=(?, 23236) dtype=float32>,
(PPO pid=46962)   'prev_actions': <tf.Tensor 'agent_0/prev_actions:0' shape=(?,) dtype=int64>,
(PPO pid=46962)   'rewards': <tf.Tensor 'agent_0/rewards:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'seq_lens': <tf.Tensor 'agent_0/seq_lens:0' shape=(?,) dtype=int32>,
(PPO pid=46962)   'state_in_0': <tf.Tensor 'agent_0/state_in_0:0' shape=(?, 256) dtype=float32>,
(PPO pid=46962)   'state_in_1': <tf.Tensor 'agent_0/state_in_1:0' shape=(?, 256) dtype=float32>,
(PPO pid=46962)   'value_targets': <tf.Tensor 'agent_0/value_targets:0' shape=(?,) dtype=float32>,
(PPO pid=46962)   'vf_preds': <tf.Tensor 'agent_0/vf_preds:0' shape=(?,) dtype=float32>}
(PPO pid=46962) 
(PPO pid=46962) 2023-01-30 13:48:11,221	DEBUG rollout_worker.py:1932 -- Creating policy for agent_1
(PPO pid=46962) 2023-01-30 13:48:11,222	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:11,222	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:11,223	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:11,223	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) 2023-01-30 13:48:11,224	DEBUG catalog.py:813 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x1738c1510>: Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)) -> (23236,)
(PPO pid=46962) 2023-01-30 13:48:11,224	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 8, 'inter_op_parallelism_threads': 8, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(PPO pid=46962) 2023-01-30 13:48:11,502	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:11,502	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:11,502	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:11,502	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) Model: "model_5"
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962)  Layer (type)                   Output Shape         Param #     Connected to                     
(PPO pid=46962) ==================================================================================================
(PPO pid=46962)  seq_in (InputLayer)            [(None,)]            0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_1/SequenceMa  [()]                0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/Max (TensorFlowOpLayer)                                                                       
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_1/SequenceMa  [()]                0           ['tf_op_layer_agent_1/SequenceMas
(PPO pid=46962)  sk/Maximum (TensorFlowOpLayer)                                  k/Max[0][0]']                    
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_1/SequenceMa  [(None, 1)]         0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/ExpandDims (TensorFlowOpLay                                                                   
(PPO pid=46962)  er)                                                                                              
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_1/SequenceMa  [(None,)]           0           ['tf_op_layer_agent_1/SequenceMas
(PPO pid=46962)  sk/Range (TensorFlowOpLayer)                                    k/Maximum[0][0]']                
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_1/SequenceMa  [(None, 1)]         0           ['tf_op_layer_agent_1/SequenceMas
(PPO pid=46962)  sk/Cast (TensorFlowOpLayer)                                     k/ExpandDims[0][0]']             
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  h (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  c (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_1/SequenceMa  [(None, None)]      0           ['tf_op_layer_agent_1/SequenceMas
(PPO pid=46962)  sk/Less (TensorFlowOpLayer)                                     k/Range[0][0]',                  
(PPO pid=46962)                                                                   'tf_op_layer_agent_1/SequenceMas
(PPO pid=46962)                                                                  k/Cast[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(PPO pid=46962)                                  (None, 256),                     'h[0][0]',                      
(PPO pid=46962)                                  (None, 256)]                     'c[0][0]',                      
(PPO pid=46962)                                                                   'tf_op_layer_agent_1/SequenceMas
(PPO pid=46962)                                                                  k/Less[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962) ==================================================================================================
(PPO pid=46962) Total params: 535,817
(PPO pid=46962) Trainable params: 535,817
(PPO pid=46962) Non-trainable params: 0
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962) 2023-01-30 13:48:11,586	INFO policy.py:1147 -- Policy (worker=local) running on CPU.
(PPO pid=46962) 2023-01-30 13:48:11,586	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(PPO pid=46962) 2023-01-30 13:48:11,648	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:11,648	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:11,648	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:11,649	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:11,649	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(PPO pid=46962) 2023-01-30 13:48:12,489	DEBUG rollout_worker.py:1932 -- Creating policy for agent_2
(PPO pid=46962) 2023-01-30 13:48:12,490	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:12,490	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:12,490	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:12,490	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) 2023-01-30 13:48:12,491	DEBUG catalog.py:813 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x17382b520>: Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)) -> (23236,)
(PPO pid=46962) 2023-01-30 13:48:12,492	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 8, 'inter_op_parallelism_threads': 8, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(PPO pid=46962) 2023-01-30 13:48:12,770	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:12,770	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:12,770	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:12,770	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) Model: "model_5"
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962)  Layer (type)                   Output Shape         Param #     Connected to                     
(PPO pid=46962) ==================================================================================================
(PPO pid=46962)  seq_in (InputLayer)            [(None,)]            0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_2/SequenceMa  [()]                0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/Max (TensorFlowOpLayer)                                                                       
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_2/SequenceMa  [()]                0           ['tf_op_layer_agent_2/SequenceMas
(PPO pid=46962)  sk/Maximum (TensorFlowOpLayer)                                  k/Max[0][0]']                    
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_2/SequenceMa  [(None, 1)]         0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/ExpandDims (TensorFlowOpLay                                                                   
(PPO pid=46962)  er)                                                                                              
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_2/SequenceMa  [(None,)]           0           ['tf_op_layer_agent_2/SequenceMas
(PPO pid=46962)  sk/Range (TensorFlowOpLayer)                                    k/Maximum[0][0]']                
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_2/SequenceMa  [(None, 1)]         0           ['tf_op_layer_agent_2/SequenceMas
(PPO pid=46962)  sk/Cast (TensorFlowOpLayer)                                     k/ExpandDims[0][0]']             
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  h (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  c (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_2/SequenceMa  [(None, None)]      0           ['tf_op_layer_agent_2/SequenceMas
(PPO pid=46962)  sk/Less (TensorFlowOpLayer)                                     k/Range[0][0]',                  
(PPO pid=46962)                                                                   'tf_op_layer_agent_2/SequenceMas
(PPO pid=46962)                                                                  k/Cast[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(PPO pid=46962)                                  (None, 256),                     'h[0][0]',                      
(PPO pid=46962)                                  (None, 256)]                     'c[0][0]',                      
(PPO pid=46962)                                                                   'tf_op_layer_agent_2/SequenceMas
(PPO pid=46962)                                                                  k/Less[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962) ==================================================================================================
(PPO pid=46962) Total params: 535,817
(PPO pid=46962) Trainable params: 535,817
(PPO pid=46962) Non-trainable params: 0
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962) 2023-01-30 13:48:12,855	INFO policy.py:1147 -- Policy (worker=local) running on CPU.
(PPO pid=46962) 2023-01-30 13:48:12,856	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(PPO pid=46962) 2023-01-30 13:48:12,917	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:12,918	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:12,918	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:12,918	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:12,918	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(PPO pid=46962) 2023-01-30 13:48:13,759	DEBUG rollout_worker.py:1932 -- Creating policy for agent_3
(PPO pid=46962) 2023-01-30 13:48:13,760	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:13,760	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:13,760	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:13,761	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) 2023-01-30 13:48:13,762	DEBUG catalog.py:813 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x176753cd0>: Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)) -> (23236,)
(PPO pid=46962) 2023-01-30 13:48:13,762	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 8, 'inter_op_parallelism_threads': 8, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(PPO pid=46962) 2023-01-30 13:48:14,042	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:14,042	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:14,042	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:14,042	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) Model: "model_5"
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962)  Layer (type)                   Output Shape         Param #     Connected to                     
(PPO pid=46962) ==================================================================================================
(PPO pid=46962)  seq_in (InputLayer)            [(None,)]            0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_3/SequenceMa  [()]                0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/Max (TensorFlowOpLayer)                                                                       
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_3/SequenceMa  [()]                0           ['tf_op_layer_agent_3/SequenceMas
(PPO pid=46962)  sk/Maximum (TensorFlowOpLayer)                                  k/Max[0][0]']                    
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_3/SequenceMa  [(None, 1)]         0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/ExpandDims (TensorFlowOpLay                                                                   
(PPO pid=46962)  er)                                                                                              
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_3/SequenceMa  [(None,)]           0           ['tf_op_layer_agent_3/SequenceMas
(PPO pid=46962)  sk/Range (TensorFlowOpLayer)                                    k/Maximum[0][0]']                
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_3/SequenceMa  [(None, 1)]         0           ['tf_op_layer_agent_3/SequenceMas
(PPO pid=46962)  sk/Cast (TensorFlowOpLayer)                                     k/ExpandDims[0][0]']             
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  h (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  c (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_3/SequenceMa  [(None, None)]      0           ['tf_op_layer_agent_3/SequenceMas
(PPO pid=46962)  sk/Less (TensorFlowOpLayer)                                     k/Range[0][0]',                  
(PPO pid=46962)                                                                   'tf_op_layer_agent_3/SequenceMas
(PPO pid=46962)                                                                  k/Cast[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(PPO pid=46962)                                  (None, 256),                     'h[0][0]',                      
(PPO pid=46962)                                  (None, 256)]                     'c[0][0]',                      
(PPO pid=46962)                                                                   'tf_op_layer_agent_3/SequenceMas
(PPO pid=46962)                                                                  k/Less[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962) ==================================================================================================
(PPO pid=46962) Total params: 535,817
(PPO pid=46962) Trainable params: 535,817
(PPO pid=46962) Non-trainable params: 0
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962) 2023-01-30 13:48:14,126	INFO policy.py:1147 -- Policy (worker=local) running on CPU.
(PPO pid=46962) 2023-01-30 13:48:14,126	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(PPO pid=46962) 2023-01-30 13:48:14,188	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:14,189	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:14,189	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:14,189	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:14,189	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(PPO pid=46962) 2023-01-30 13:48:15,020	DEBUG rollout_worker.py:1932 -- Creating policy for agent_4
(PPO pid=46962) 2023-01-30 13:48:15,021	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:15,021	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:15,021	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:15,021	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) 2023-01-30 13:48:15,022	DEBUG catalog.py:813 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x3114fe950>: Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)) -> (23236,)
(PPO pid=46962) 2023-01-30 13:48:15,023	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 8, 'inter_op_parallelism_threads': 8, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(PPO pid=46962) 2023-01-30 13:48:15,439	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:15,439	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:15,439	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:15,439	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) Model: "model_5"
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962)  Layer (type)                   Output Shape         Param #     Connected to                     
(PPO pid=46962) ==================================================================================================
(PPO pid=46962)  seq_in (InputLayer)            [(None,)]            0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_4/SequenceMa  [()]                0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/Max (TensorFlowOpLayer)                                                                       
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_4/SequenceMa  [()]                0           ['tf_op_layer_agent_4/SequenceMas
(PPO pid=46962)  sk/Maximum (TensorFlowOpLayer)                                  k/Max[0][0]']                    
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_4/SequenceMa  [(None, 1)]         0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/ExpandDims (TensorFlowOpLay                                                                   
(PPO pid=46962)  er)                                                                                              
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_4/SequenceMa  [(None,)]           0           ['tf_op_layer_agent_4/SequenceMas
(PPO pid=46962)  sk/Range (TensorFlowOpLayer)                                    k/Maximum[0][0]']                
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_4/SequenceMa  [(None, 1)]         0           ['tf_op_layer_agent_4/SequenceMas
(PPO pid=46962)  sk/Cast (TensorFlowOpLayer)                                     k/ExpandDims[0][0]']             
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  h (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  c (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_4/SequenceMa  [(None, None)]      0           ['tf_op_layer_agent_4/SequenceMas
(PPO pid=46962)  sk/Less (TensorFlowOpLayer)                                     k/Range[0][0]',                  
(PPO pid=46962)                                                                   'tf_op_layer_agent_4/SequenceMas
(PPO pid=46962)                                                                  k/Cast[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(PPO pid=46962)                                  (None, 256),                     'h[0][0]',                      
(PPO pid=46962)                                  (None, 256)]                     'c[0][0]',                      
(PPO pid=46962)                                                                   'tf_op_layer_agent_4/SequenceMas
(PPO pid=46962)                                                                  k/Less[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962) ==================================================================================================
(PPO pid=46962) Total params: 535,817
(PPO pid=46962) Trainable params: 535,817
(PPO pid=46962) Non-trainable params: 0
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962) 2023-01-30 13:48:15,526	INFO policy.py:1147 -- Policy (worker=local) running on CPU.
(PPO pid=46962) 2023-01-30 13:48:15,526	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(PPO pid=46962) 2023-01-30 13:48:15,588	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:15,588	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:15,589	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:15,589	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:15,589	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(PPO pid=46962) 2023-01-30 13:48:16,429	DEBUG rollout_worker.py:1932 -- Creating policy for agent_5
(PPO pid=46962) 2023-01-30 13:48:16,430	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:16,430	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:16,430	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:16,430	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) 2023-01-30 13:48:16,431	DEBUG catalog.py:813 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x313046590>: Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)) -> (23236,)
(PPO pid=46962) 2023-01-30 13:48:16,432	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 8, 'inter_op_parallelism_threads': 8, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(PPO pid=46962) 2023-01-30 13:48:16,712	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:16,712	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:16,712	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:16,712	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) Model: "model_5"
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962)  Layer (type)                   Output Shape         Param #     Connected to                     
(PPO pid=46962) ==================================================================================================
(PPO pid=46962)  seq_in (InputLayer)            [(None,)]            0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_5/SequenceMa  [()]                0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/Max (TensorFlowOpLayer)                                                                       
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_5/SequenceMa  [()]                0           ['tf_op_layer_agent_5/SequenceMas
(PPO pid=46962)  sk/Maximum (TensorFlowOpLayer)                                  k/Max[0][0]']                    
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_5/SequenceMa  [(None, 1)]         0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/ExpandDims (TensorFlowOpLay                                                                   
(PPO pid=46962)  er)                                                                                              
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_5/SequenceMa  [(None,)]           0           ['tf_op_layer_agent_5/SequenceMas
(PPO pid=46962)  sk/Range (TensorFlowOpLayer)                                    k/Maximum[0][0]']                
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_5/SequenceMa  [(None, 1)]         0           ['tf_op_layer_agent_5/SequenceMas
(PPO pid=46962)  sk/Cast (TensorFlowOpLayer)                                     k/ExpandDims[0][0]']             
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  h (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  c (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_5/SequenceMa  [(None, None)]      0           ['tf_op_layer_agent_5/SequenceMas
(PPO pid=46962)  sk/Less (TensorFlowOpLayer)                                     k/Range[0][0]',                  
(PPO pid=46962)                                                                   'tf_op_layer_agent_5/SequenceMas
(PPO pid=46962)                                                                  k/Cast[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(PPO pid=46962)                                  (None, 256),                     'h[0][0]',                      
(PPO pid=46962)                                  (None, 256)]                     'c[0][0]',                      
(PPO pid=46962)                                                                   'tf_op_layer_agent_5/SequenceMas
(PPO pid=46962)                                                                  k/Less[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962) ==================================================================================================
(PPO pid=46962) Total params: 535,817
(PPO pid=46962) Trainable params: 535,817
(PPO pid=46962) Non-trainable params: 0
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962) 2023-01-30 13:48:16,800	INFO policy.py:1147 -- Policy (worker=local) running on CPU.
(PPO pid=46962) 2023-01-30 13:48:16,800	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(PPO pid=46962) 2023-01-30 13:48:16,861	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:16,862	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:16,862	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:16,862	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:16,862	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(PPO pid=46962) 2023-01-30 13:48:17,701	DEBUG rollout_worker.py:1932 -- Creating policy for agent_6
(PPO pid=46962) 2023-01-30 13:48:17,702	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:17,702	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:17,702	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:17,702	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) 2023-01-30 13:48:17,703	DEBUG catalog.py:813 -- Created preprocessor <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x3146ec430>: Dict(ORIENTATION:Box(-2147483648, 2147483647, (), int32), POSITION:Box(-2147483648, 2147483647, (2,), int32), READY_TO_SHOOT:Box(-inf, inf, (), float64), RGB:Box(0, 255, (88, 88, 3), uint8)) -> (23236,)
(PPO pid=46962) 2023-01-30 13:48:17,704	DEBUG worker_set.py:938 -- Creating TF session {'intra_op_parallelism_threads': 8, 'inter_op_parallelism_threads': 8, 'gpu_options': {'allow_growth': True}, 'log_device_placement': False, 'device_count': {'CPU': 1}, 'allow_soft_placement': True}
(PPO pid=46962) 2023-01-30 13:48:17,986	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (), int32)
(PPO pid=46962) 2023-01-30 13:48:17,986	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-2147483648, 2147483647, (2,), int32)
(PPO pid=46962) 2023-01-30 13:48:17,986	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(-inf, inf, (), float64)
(PPO pid=46962) 2023-01-30 13:48:17,986	DEBUG preprocessors.py:272 -- Creating sub-preprocessor for Box(0, 255, (88, 88, 3), uint8)
(PPO pid=46962) 2023-01-30 13:48:18,069	INFO policy.py:1147 -- Policy (worker=local) running on CPU.
(PPO pid=46962) 2023-01-30 13:48:18,069	INFO tf_policy.py:171 -- Found 0 visible cuda devices.
(PPO pid=46962) Model: "model_5"
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962)  Layer (type)                   Output Shape         Param #     Connected to                     
(PPO pid=46962) ==================================================================================================
(PPO pid=46962)  seq_in (InputLayer)            [(None,)]            0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_6/SequenceMa  [()]                0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/Max (TensorFlowOpLayer)                                                                       
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_6/SequenceMa  [()]                0           ['tf_op_layer_agent_6/SequenceMas
(PPO pid=46962)  sk/Maximum (TensorFlowOpLayer)                                  k/Max[0][0]']                    
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_6/SequenceMa  [(None, 1)]         0           ['seq_in[0][0]']                 
(PPO pid=46962)  sk/ExpandDims (TensorFlowOpLay                                                                   
(PPO pid=46962)  er)                                                                                              
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_6/SequenceMa  [(None,)]           0           ['tf_op_layer_agent_6/SequenceMas
(PPO pid=46962)  sk/Range (TensorFlowOpLayer)                                    k/Maximum[0][0]']                
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_6/SequenceMa  [(None, 1)]         0           ['tf_op_layer_agent_6/SequenceMas
(PPO pid=46962)  sk/Cast (TensorFlowOpLayer)                                     k/ExpandDims[0][0]']             
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  inputs (InputLayer)            [(None, None, 264)]  0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  h (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  c (InputLayer)                 [(None, 256)]        0           []                               
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  tf_op_layer_agent_6/SequenceMa  [(None, None)]      0           ['tf_op_layer_agent_6/SequenceMas
(PPO pid=46962)  sk/Less (TensorFlowOpLayer)                                     k/Range[0][0]',                  
(PPO pid=46962)                                                                   'tf_op_layer_agent_6/SequenceMas
(PPO pid=46962)                                                                  k/Cast[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  lstm (LSTM)                    [(None, None, 256),  533504      ['inputs[0][0]',                 
(PPO pid=46962)                                  (None, 256),                     'h[0][0]',                      
(PPO pid=46962)                                  (None, 256)]                     'c[0][0]',                      
(PPO pid=46962)                                                                   'tf_op_layer_agent_6/SequenceMas
(PPO pid=46962)                                                                  k/Less[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  logits (Dense)                 (None, None, 8)      2056        ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962)  values (Dense)                 (None, None, 1)      257         ['lstm[0][0]']                   
(PPO pid=46962)                                                                                                   
(PPO pid=46962) ==================================================================================================
(PPO pid=46962) Total params: 535,817
(PPO pid=46962) Trainable params: 535,817
(PPO pid=46962) Non-trainable params: 0
(PPO pid=46962) __________________________________________________________________________________________________
(PPO pid=46962) 2023-01-30 13:48:18,130	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_prob` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:18,131	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_logp` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:18,131	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `action_dist_inputs` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:18,131	INFO dynamic_tf_policy_v2.py:709 -- Adding extra-action-fetch `vf_preds` to view-reqs.
(PPO pid=46962) 2023-01-30 13:48:18,131	INFO dynamic_tf_policy_v2.py:721 -- Testing `postprocess_trajectory` w/ dummy batch.
(PPO pid=46962) 2023-01-30 13:48:18,961	INFO rollout_worker.py:2004 -- Built policy map: {}
(PPO pid=46962) 2023-01-30 13:48:18,961	INFO rollout_worker.py:2005 -- Built preprocessor map: {'agent_0': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x1738c1270>, 'agent_1': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x1738c1510>, 'agent_2': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x17382b520>, 'agent_3': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x176753cd0>, 'agent_4': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x3114fe950>, 'agent_5': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x313046590>, 'agent_6': <ray.rllib.models.preprocessors.DictFlatteningPreprocessor object at 0x3146ec430>}
(PPO pid=46962) 2023-01-30 13:48:18,961	INFO rollout_worker.py:740 -- Built filter map: defaultdict(<class 'ray.rllib.utils.filter.NoFilter'>, {'agent_0': <ray.rllib.utils.filter.NoFilter object at 0x1738e41c0>, 'agent_1': <ray.rllib.utils.filter.NoFilter object at 0x3082d7f10>, 'agent_2': <ray.rllib.utils.filter.NoFilter object at 0x3128d83a0>, 'agent_3': <ray.rllib.utils.filter.NoFilter object at 0x313dc0940>, 'agent_4': <ray.rllib.utils.filter.NoFilter object at 0x31411e800>, 'agent_5': <ray.rllib.utils.filter.NoFilter object at 0x316d814b0>, 'agent_6': <ray.rllib.utils.filter.NoFilter object at 0x318269ae0>})
(PPO pid=46962) 2023-01-30 13:48:18,961	DEBUG rollout_worker.py:841 -- Created rollout worker with env None (None), policies {}
== Status ==
Current time: 2023-01-30 13:48:19 (running for 00:00:26.73)
Memory usage on this node: 11.4/16.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 3.0/8 CPUs, 0/0 GPUs, 0.0/6.42 GiB heap, 0.0/2.0 GiB objects
Result logdir: /Users/nell/ray_results/PPO
Number of trials: 1/1 (1 RUNNING)


(PPO pid=46962) 2023-01-30 13:48:19,170	INFO algorithm_config.py:2503 -- Your framework setting is 'tf', meaning you are using static-graph mode. Set framework='tf2' to enable eager execution with tf2.x. You may also then want to set eager_tracing=True in order to reach similar execution speed as with static-graph mode.
(PPO pid=46962) 2023-01-30 13:48:19,170	INFO trainable.py:172 -- Trainable.setup took 22.915 seconds. If your trainable is slow to initialize, consider setting reuse_actors=True to reduce actor creation overheads.
(PPO pid=46962) 2023-01-30 13:48:19,171	WARNING util.py:66 -- Install gputil for GPU system monitoring.
(RolloutWorker pid=46974) 2023-01-30 13:48:19,400	INFO rollout_worker.py:894 -- Generating sample batch of size 100
(RolloutWorker pid=46974) 2023-01-30 13:48:19,400	DEBUG sampler.py:631 -- No episode horizon specified, assuming inf.
(RolloutWorker pid=46975) 2023-01-30 13:48:19,424	DEBUG sampler.py:631 -- No episode horizon specified, assuming inf.
(RolloutWorker pid=46974) 2023-01-30 13:48:19,645	INFO sampler.py:664 -- Raw obs from env: { 0: { 'player_0': { 'ORIENTATION': np.ndarray((), dtype=int32, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                      'POSITION': np.ndarray((2,), dtype=int32, min=7.0, max=7.0, mean=7.0),
(RolloutWorker pid=46974)                      'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                      'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=37.0, max=255.0, mean=161.841)},
(RolloutWorker pid=46974)        'player_1': { 'ORIENTATION': np.ndarray((), dtype=int32, min=2.0, max=2.0, mean=2.0),
(RolloutWorker pid=46974)                      'POSITION': np.ndarray((2,), dtype=int32, min=7.0, max=16.0, mean=11.5),
(RolloutWorker pid=46974)                      'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                      'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=167.564)},
(RolloutWorker pid=46974)        'player_2': { 'ORIENTATION': np.ndarray((), dtype=int32, min=2.0, max=2.0, mean=2.0),
(RolloutWorker pid=46974)                      'POSITION': np.ndarray((2,), dtype=int32, min=14.0, max=19.0, mean=16.5),
(RolloutWorker pid=46974)                      'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                      'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=67.101)},
(RolloutWorker pid=46974)        'player_3': { 'ORIENTATION': np.ndarray((), dtype=int32, min=0.0, max=0.0, mean=0.0),
(RolloutWorker pid=46974)                      'POSITION': np.ndarray((2,), dtype=int32, min=7.0, max=16.0, mean=11.5),
(RolloutWorker pid=46974)                      'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                      'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=161.451)},
(RolloutWorker pid=46974)        'player_4': { 'ORIENTATION': np.ndarray((), dtype=int32, min=3.0, max=3.0, mean=3.0),
(RolloutWorker pid=46974)                      'POSITION': np.ndarray((2,), dtype=int32, min=14.0, max=14.0, mean=14.0),
(RolloutWorker pid=46974)                      'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                      'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=131.464)},
(RolloutWorker pid=46974)        'player_5': { 'ORIENTATION': np.ndarray((), dtype=int32, min=3.0, max=3.0, mean=3.0),
(RolloutWorker pid=46974)                      'POSITION': np.ndarray((2,), dtype=int32, min=10.0, max=16.0, mean=13.0),
(RolloutWorker pid=46974)                      'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                      'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=104.322)},
(RolloutWorker pid=46974)        'player_6': { 'ORIENTATION': np.ndarray((), dtype=int32, min=0.0, max=0.0, mean=0.0),
(RolloutWorker pid=46974)                      'POSITION': np.ndarray((2,), dtype=int32, min=3.0, max=14.0, mean=8.5),
(RolloutWorker pid=46974)                      'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                      'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=136.955)}}}
(RolloutWorker pid=46974) 2023-01-30 13:48:19,646	INFO sampler.py:665 -- Info return from env: {0: {}}
(RolloutWorker pid=46974) 2023-01-30 13:48:19,646	WARNING deprecation.py:47 -- DeprecationWarning: `policy_mapping_fn(agent_id)` has been deprecated. Use `policy_mapping_fn(agent_id, episode, worker, **kwargs)` instead. This will raise an error in the future!
(RolloutWorker pid=46974) 2023-01-30 13:48:19,646	INFO sampler.py:929 -- Filtered obs: { 'ORIENTATION': np.ndarray((), dtype=int32, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)   'POSITION': np.ndarray((2,), dtype=int32, min=7.0, max=7.0, mean=7.0),
(RolloutWorker pid=46974)   'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)   'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=37.0, max=255.0, mean=161.841)}
(RolloutWorker pid=46974) 2023-01-30 13:48:19,646	WARNING agent_collector.py:155 -- Provided tensor
(RolloutWorker pid=46974) {'READY_TO_SHOOT': array(1.), 'ORIENTATION': array(1, dtype=int32), 'POSITION': array([7, 7], dtype=int32), 'RGB': array([[[158, 194, 101],
(RolloutWorker pid=46974)         [158, 194, 101],
(RolloutWorker pid=46974)         [158, 194, 101],
(RolloutWorker pid=46974)         ...,
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         [220, 205, 185]],
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974)        [[158, 194, 101],
(RolloutWorker pid=46974)         [158, 194, 101],
(RolloutWorker pid=46974)         [158, 194, 101],
(RolloutWorker pid=46974)         ...,
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         [210, 195, 175]],
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974)        [[158, 194, 101],
(RolloutWorker pid=46974)         [158, 194, 101],
(RolloutWorker pid=46974)         [ 53, 132,  49],
(RolloutWorker pid=46974)         ...,
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         [220, 205, 185]],
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974)        ...,
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974)        [[210, 195, 175],
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         ...,
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         [220, 205, 185]],
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974)        [[220, 205, 185],
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         ...,
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         [220, 205, 185]],
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974)        [[210, 195, 175],
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         ...,
(RolloutWorker pid=46974)         [220, 205, 185],
(RolloutWorker pid=46974)         [210, 195, 175],
(RolloutWorker pid=46974)         [220, 205, 185]]], dtype=uint8)}
(RolloutWorker pid=46974)  does not match space of view requirements obs.
(RolloutWorker pid=46974) Provided tensor has shape () and view requirement has shape shape None.Make sure dimensions match to resolve this warning.
(RolloutWorker pid=46974) 2023-01-30 13:48:19,648	INFO sampler.py:1187 -- Inputs to compute_actions():
(RolloutWorker pid=46974) 
(RolloutWorker pid=46974) { 'agent_0': [ { 'data': { 'agent_id': 'player_0',
(RolloutWorker pid=46974)                            'env_id': 0,
(RolloutWorker pid=46974)                            'info': {},
(RolloutWorker pid=46974)                            'obs': { 'ORIENTATION': np.ndarray((), dtype=int32, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                                     'POSITION': np.ndarray((2,), dtype=int32, min=7.0, max=7.0, mean=7.0),
(RolloutWorker pid=46974)                                     'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                                     'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=37.0, max=255.0, mean=161.841)},
(RolloutWorker pid=46974)                            'prev_action': None,
(RolloutWorker pid=46974)                            'prev_reward': 0.0,
(RolloutWorker pid=46974)                            'rnn_state': None},
(RolloutWorker pid=46974)                  'type': '_PolicyEvalData'}],
(RolloutWorker pid=46974)   'agent_1': [ { 'data': { 'agent_id': 'player_1',
(RolloutWorker pid=46974)                            'env_id': 0,
(RolloutWorker pid=46974)                            'info': {},
(RolloutWorker pid=46974)                            'obs': { 'ORIENTATION': np.ndarray((), dtype=int32, min=2.0, max=2.0, mean=2.0),
(RolloutWorker pid=46974)                                     'POSITION': np.ndarray((2,), dtype=int32, min=7.0, max=16.0, mean=11.5),
(RolloutWorker pid=46974)                                     'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                                     'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=167.564)},
(RolloutWorker pid=46974)                            'prev_action': None,
(RolloutWorker pid=46974)                            'prev_reward': 0.0,
(RolloutWorker pid=46974)                            'rnn_state': None},
(RolloutWorker pid=46974)                  'type': '_PolicyEvalData'}],
(RolloutWorker pid=46974)   'agent_2': [ { 'data': { 'agent_id': 'player_2',
(RolloutWorker pid=46974)                            'env_id': 0,
(RolloutWorker pid=46974)                            'info': {},
(RolloutWorker pid=46974)                            'obs': { 'ORIENTATION': np.ndarray((), dtype=int32, min=2.0, max=2.0, mean=2.0),
(RolloutWorker pid=46974)                                     'POSITION': np.ndarray((2,), dtype=int32, min=14.0, max=19.0, mean=16.5),
(RolloutWorker pid=46974)                                     'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                                     'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=67.101)},
(RolloutWorker pid=46974)                            'prev_action': None,
(RolloutWorker pid=46974)                            'prev_reward': 0.0,
(RolloutWorker pid=46974)                            'rnn_state': None},
(RolloutWorker pid=46974)                  'type': '_PolicyEvalData'}],
(RolloutWorker pid=46974)   'agent_3': [ { 'data': { 'agent_id': 'player_3',
(RolloutWorker pid=46974)                            'env_id': 0,
(RolloutWorker pid=46974)                            'info': {},
(RolloutWorker pid=46974)                            'obs': { 'ORIENTATION': np.ndarray((), dtype=int32, min=0.0, max=0.0, mean=0.0),
(RolloutWorker pid=46974)                                     'POSITION': np.ndarray((2,), dtype=int32, min=7.0, max=16.0, mean=11.5),
(RolloutWorker pid=46974)                                     'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                                     'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=161.451)},
(RolloutWorker pid=46974)                            'prev_action': None,
(RolloutWorker pid=46974)                            'prev_reward': 0.0,
(RolloutWorker pid=46974)                            'rnn_state': None},
(RolloutWorker pid=46974)                  'type': '_PolicyEvalData'}],
(RolloutWorker pid=46974)   'agent_4': [ { 'data': { 'agent_id': 'player_4',
(RolloutWorker pid=46974)                            'env_id': 0,
(RolloutWorker pid=46974)                            'info': {},
(RolloutWorker pid=46974)                            'obs': { 'ORIENTATION': np.ndarray((), dtype=int32, min=3.0, max=3.0, mean=3.0),
(RolloutWorker pid=46974)                                     'POSITION': np.ndarray((2,), dtype=int32, min=14.0, max=14.0, mean=14.0),
(RolloutWorker pid=46974)                                     'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                                     'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=131.464)},
(RolloutWorker pid=46974)                            'prev_action': None,
(RolloutWorker pid=46974)                            'prev_reward': 0.0,
(RolloutWorker pid=46974)                            'rnn_state': None},
(RolloutWorker pid=46974)                  'type': '_PolicyEvalData'}],
(RolloutWorker pid=46974)   'agent_5': [ { 'data': { 'agent_id': 'player_5',
(RolloutWorker pid=46974)                            'env_id': 0,
(RolloutWorker pid=46974)                            'info': {},
(RolloutWorker pid=46974)                            'obs': { 'ORIENTATION': np.ndarray((), dtype=int32, min=3.0, max=3.0, mean=3.0),
(RolloutWorker pid=46974)                                     'POSITION': np.ndarray((2,), dtype=int32, min=10.0, max=16.0, mean=13.0),
(RolloutWorker pid=46974)                                     'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                                     'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=104.322)},
(RolloutWorker pid=46974)                            'prev_action': None,
(RolloutWorker pid=46974)                            'prev_reward': 0.0,
(RolloutWorker pid=46974)                            'rnn_state': None},
(RolloutWorker pid=46974)                  'type': '_PolicyEvalData'}],
(RolloutWorker pid=46974)   'agent_6': [ { 'data': { 'agent_id': 'player_6',
(RolloutWorker pid=46974)                            'env_id': 0,
(RolloutWorker pid=46974)                            'info': {},
(RolloutWorker pid=46974)                            'obs': { 'ORIENTATION': np.ndarray((), dtype=int32, min=0.0, max=0.0, mean=0.0),
(RolloutWorker pid=46974)                                     'POSITION': np.ndarray((2,), dtype=int32, min=3.0, max=14.0, mean=8.5),
(RolloutWorker pid=46974)                                     'READY_TO_SHOOT': np.ndarray((), dtype=float64, min=1.0, max=1.0, mean=1.0),
(RolloutWorker pid=46974)                                     'RGB': np.ndarray((88, 88, 3), dtype=uint8, min=0.0, max=255.0, mean=136.955)},
(RolloutWorker pid=46974)                            'prev_action': None,
(RolloutWorker pid=46974)                            'prev_reward': 0.0,
(RolloutWorker pid=46974)                            'rnn_state': None},
(RolloutWorker pid=46974)                  'type': '_PolicyEvalData'}]}
(RolloutWorker pid=46974) 
2023-01-30 13:48:19,693	ERROR trial_runner.py:1088 -- Trial PPO_meltingpot_4f50d_00000: Error processing event.
ray.exceptions.RayTaskError(ValueError): ray::PPO.train() (pid=46962, ip=127.0.0.1, repr=PPO)
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 367, in train
    raise skipped from exception_cause(skipped)
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/tune/trainable/trainable.py", line 364, in train
    result = self.step()
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 749, in step
    results, train_iter_ctx = self._run_one_training_iteration()
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/algorithms/algorithm.py", line 2623, in _run_one_training_iteration
    results = self.training_step()
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/algorithms/ppo/ppo.py", line 318, in training_step
    train_batch = synchronous_parallel_sample(
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/execution/rollout_ops.py", line 85, in synchronous_parallel_sample
    sample_batches = worker_set.foreach_worker(
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/worker_set.py", line 696, in foreach_worker
    handle_remote_call_result_errors(remote_results, self._ignore_worker_failures)
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/worker_set.py", line 73, in handle_remote_call_result_errors
    raise r.get()
ray.exceptions.RayTaskError(ValueError): ray::RolloutWorker.apply() (pid=46974, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x17dd7bfd0>)
ValueError: The two structures don't have the same nested structure.

First structure: type=Tensor str=Tensor("agent_0_wk1/Placeholder:0", shape=(?, 23236), dtype=float32)

Second structure: type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'POSITION': array([[7, 7]], dtype=int32), 'RGB': array([[[[158, 194, 101],
         [158, 194, 101],
         [158, 194, 101],
         ...,
         [210, 195, 175],
         [210, 195, 175],
         [220, 205, 185]],

        [[158, 194, 101],
         [158, 194, 101],
         [158, 194, 101],
         ...,
         [210, 195, 175],
         [220, 205, 185],
         [210, 195, 175]],

        [[158, 194, 101],
         [158, 194, 101],
         [ 53, 132,  49],
         ...,
         [220, 205, 185],
         [220, 205, 185],
         [220, 205, 185]],

        ...,

        [[210, 195, 175],
         [220, 205, 185],
         [210, 195, 175],
         ...,
         [210, 195, 175],
         [210, 195, 175],
         [220, 205, 185]],

        [[220, 205, 185],
         [210, 195, 175],
         [220, 205, 185],
         ...,
         [220, 205, 185],
         [220, 205, 185],
         [220, 205, 185]],

        [[210, 195, 175],
         [220, 205, 185],
         [210, 195, 175],
         ...,
         [220, 205, 185],
         [210, 195, 175],
         [220, 205, 185]]]], dtype=uint8)}

More specifically: Substructure "type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'POSITION': array([[7, 7]], dtype=int32), 'RGB': array([[[[158, 194, 101],
         [158, 194, 101],
         [158, 194, 101],
         ...,
         [210, 195, 175],
         [210, 195, 175],
         [220, 205, 185]],

        [[158, 194, 101],
         [158, 194, 101],
         [158, 194, 101],
         ...,
         [210, 195, 175],
         [220, 205, 185],
         [210, 195, 175]],

        [[158, 194, 101],
         [158, 194, 101],
         [ 53, 132,  49],
         ...,
         [220, 205, 185],
         [220, 205, 185],
         [220, 205, 185]],

        ...,

        [[210, 195, 175],
         [220, 205, 185],
         [210, 195, 175],
         ...,
         [210, 195, 175],
         [210, 195, 175],
         [220, 205, 185]],

        [[220, 205, 185],
         [210, 195, 175],
         [220, 205, 185],
         ...,
         [220, 205, 185],
         [220, 205, 185],
         [220, 205, 185]],

        [[210, 195, 175],
         [220, 205, 185],
         [210, 195, 175],
         ...,
         [220, 205, 185],
         [210, 195, 175],
         [220, 205, 185]]]], dtype=uint8)}" is a sequence, while substructure "type=Tensor str=Tensor("agent_0_wk1/Placeholder:0", shape=(?, 23236), dtype=float32)" is not

During handling of the above exception, another exception occurred:

ray::RolloutWorker.apply() (pid=46974, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x17dd7bfd0>)
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 183, in apply
    raise e
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 174, in apply
    return func(self, *args, **kwargs)
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/execution/rollout_ops.py", line 86, in <lambda>
    lambda w: w.sample(), local_worker=False, healthy_only=True
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 900, in sample
    batches = [self.input_reader.next()]
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 92, in next
    batches = [self.get_data()]
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 285, in get_data
    item = next(self._env_runner)
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 694, in _env_runner
    eval_results = _do_policy_eval(
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 1207, in _do_policy_eval
    eval_results[policy_id] = policy.compute_actions_from_input_dict(
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/policy/tf_policy.py", line 321, in compute_actions_from_input_dict
    to_fetch = self._build_compute_actions(
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/policy/tf_policy.py", line 1087, in _build_compute_actions
    tree.map_structure(
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/tree/__init__.py", line 433, in map_structure
    assert_same_structure(structures[0], other, check_types=check_types)
  File "/Users/nell/miniforge3/lib/python3.10/site-packages/tree/__init__.py", line 288, in assert_same_structure
    raise type(e)("%s\n"
ValueError: The two structures don't have the same nested structure.

First structure: type=Tensor str=Tensor("agent_0_wk1/Placeholder:0", shape=(?, 23236), dtype=float32)

Second structure: type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'POSITION': array([[7, 7]], dtype=int32), 'RGB': array([[[[158, 194, 101],
         [158, 194, 101],
         [158, 194, 101],
         ...,
         [210, 195, 175],
         [210, 195, 175],
         [220, 205, 185]],

        [[158, 194, 101],
         [158, 194, 101],
         [158, 194, 101],
         ...,
         [210, 195, 175],
         [220, 205, 185],
         [210, 195, 175]],

        [[158, 194, 101],
         [158, 194, 101],
         [ 53, 132,  49],
         ...,
         [220, 205, 185],
         [220, 205, 185],
         [220, 205, 185]],

        ...,

        [[210, 195, 175],
         [220, 205, 185],
         [210, 195, 175],
         ...,
         [210, 195, 175],
         [210, 195, 175],
         [220, 205, 185]],

        [[220, 205, 185],
         [210, 195, 175],
         [220, 205, 185],
         ...,
         [220, 205, 185],
         [220, 205, 185],
         [220, 205, 185]],

        [[210, 195, 175],
         [220, 205, 185],
         [210, 195, 175],
         ...,
         [220, 205, 185],
         [210, 195, 175],
         [220, 205, 185]]]], dtype=uint8)}

More specifically: Substructure "type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'POSITION': array([[7, 7]], dtype=int32), 'RGB': array([[[[158, 194, 101],
         [158, 194, 101],
         [158, 194, 101],
         ...,
         [210, 195, 175],
         [210, 195, 175],
         [220, 205, 185]],

        [[158, 194, 101],
         [158, 194, 101],
         [158, 194, 101],
         ...,
         [210, 195, 175],
         [220, 205, 185],
         [210, 195, 175]],

        [[158, 194, 101],
         [158, 194, 101],
         [ 53, 132,  49],
         ...,
         [220, 205, 185],
         [220, 205, 185],
         [220, 205, 185]],

        ...,

        [[210, 195, 175],
         [220, 205, 185],
         [210, 195, 175],
         ...,
         [210, 195, 175],
         [210, 195, 175],
         [220, 205, 185]],

        [[220, 205, 185],
         [210, 195, 175],
         [220, 205, 185],
         ...,
         [220, 205, 185],
         [220, 205, 185],
         [220, 205, 185]],

        [[210, 195, 175],
         [220, 205, 185],
         [210, 195, 175],
         ...,
         [220, 205, 185],
         [210, 195, 175],
         [220, 205, 185]]]], dtype=uint8)}" is a sequence, while substructure "type=Tensor str=Tensor("agent_0_wk1/Placeholder:0", shape=(?, 23236), dtype=float32)" is not
Entire first structure:
.
Entire second structure:
{'READY_TO_SHOOT': ., 'ORIENTATION': ., 'POSITION': ., 'RGB': .}
== Status ==
Current time: 2023-01-30 13:48:19 (running for 00:00:27.26)
Memory usage on this node: 11.5/16.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/6.42 GiB heap, 0.0/2.0 GiB objects
Result logdir: /Users/nell/ray_results/PPO
Number of trials: 1/1 (1 ERROR)
Number of errored trials: 1
+----------------------------+--------------+----------------------------------------------------------------------------------------+
| Trial name                 |   # failures | error file                                                                             |
|----------------------------+--------------+----------------------------------------------------------------------------------------|
| PPO_meltingpot_4f50d_00000 |            1 | /Users/nell/ray_results/PPO/PPO_meltingpot_4f50d_00000_0_2023-01-30_13-47-52/error.txt |
+----------------------------+--------------+----------------------------------------------------------------------------------------+

== Status ==
Current time: 2023-01-30 13:48:19 (running for 00:00:27.27)
Memory usage on this node: 11.5/16.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/0 GPUs, 0.0/6.42 GiB heap, 0.0/2.0 GiB objects
Result logdir: /Users/nell/ray_results/PPO
Number of trials: 1/1 (1 ERROR)
Number of errored trials: 1
+----------------------------+--------------+----------------------------------------------------------------------------------------+
| Trial name                 |   # failures | error file                                                                             |
|----------------------------+--------------+----------------------------------------------------------------------------------------|
| PPO_meltingpot_4f50d_00000 |            1 | /Users/nell/ray_results/PPO/PPO_meltingpot_4f50d_00000_0_2023-01-30_13-47-52/error.txt |
+----------------------------+--------------+----------------------------------------------------------------------------------------+

(PPO pid=46962) 2023-01-30 13:48:19,688	ERROR actor_manager.py:486 -- Ray error, taking actor 1 out of service. ray::RolloutWorker.apply() (pid=46974, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x17dd7bfd0>)
(PPO pid=46962) ValueError: The two structures don't have the same nested structure.
(PPO pid=46962) 
(PPO pid=46962) First structure: type=Tensor str=Tensor("agent_0_wk1/Placeholder:0", shape=(?, 23236), dtype=float32)
(PPO pid=46962) 
(PPO pid=46962) Second structure: type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'POSITION': array([[7, 7]], dtype=int32), 'RGB': array([[[[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [ 53, 132,  49],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         ...,
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]]]], dtype=uint8)}
(PPO pid=46962) 
(PPO pid=46962) More specifically: Substructure "type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'POSITION': array([[7, 7]], dtype=int32), 'RGB': array([[[[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [ 53, 132,  49],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         ...,
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]]]], dtype=uint8)}" is a sequence, while substructure "type=Tensor str=Tensor("agent_0_wk1/Placeholder:0", shape=(?, 23236), dtype=float32)" is not
(PPO pid=46962) 
(PPO pid=46962) During handling of the above exception, another exception occurred:
(PPO pid=46962) 
(PPO pid=46962) ray::RolloutWorker.apply() (pid=46974, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x17dd7bfd0>)
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 183, in apply
(PPO pid=46962)     raise e
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 174, in apply
(PPO pid=46962)     return func(self, *args, **kwargs)
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/execution/rollout_ops.py", line 86, in <lambda>
(PPO pid=46962)     lambda w: w.sample(), local_worker=False, healthy_only=True
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 900, in sample
(PPO pid=46962)     batches = [self.input_reader.next()]
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 92, in next
(PPO pid=46962)     batches = [self.get_data()]
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 285, in get_data
(PPO pid=46962)     item = next(self._env_runner)
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 694, in _env_runner
(PPO pid=46962)     eval_results = _do_policy_eval(
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 1207, in _do_policy_eval
(PPO pid=46962)     eval_results[policy_id] = policy.compute_actions_from_input_dict(
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/policy/tf_policy.py", line 321, in compute_actions_from_input_dict
(PPO pid=46962)     to_fetch = self._build_compute_actions(
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/policy/tf_policy.py", line 1087, in _build_compute_actions
(PPO pid=46962)     tree.map_structure(
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/tree/__init__.py", line 433, in map_structure
(PPO pid=46962)     assert_same_structure(structures[0], other, check_types=check_types)
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/tree/__init__.py", line 288, in assert_same_structure
(PPO pid=46962)     raise type(e)("%s\n"
(PPO pid=46962) ValueError: The two structures don't have the same nested structure.
(PPO pid=46962) 
(PPO pid=46962) First structure: type=Tensor str=Tensor("agent_0_wk1/Placeholder:0", shape=(?, 23236), dtype=float32)
(PPO pid=46962) 
(PPO pid=46962) Second structure: type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'POSITION': array([[7, 7]], dtype=int32), 'RGB': array([[[[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [ 53, 132,  49],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         ...,
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]]]], dtype=uint8)}
(PPO pid=46962) 
(PPO pid=46962) More specifically: Substructure "type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'POSITION': array([[7, 7]], dtype=int32), 'RGB': array([[[[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [ 53, 132,  49],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         ...,
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]]]], dtype=uint8)}" is a sequence, while substructure "type=Tensor str=Tensor("agent_0_wk1/Placeholder:0", shape=(?, 23236), dtype=float32)" is not
(PPO pid=46962) Entire first structure:
(PPO pid=46962) .
(PPO pid=46962) Entire second structure:
(PPO pid=46962) {'READY_TO_SHOOT': ., 'ORIENTATION': ., 'POSITION': ., 'RGB': .}
(PPO pid=46962) 2023-01-30 13:48:19,688	ERROR actor_manager.py:486 -- Ray error, taking actor 2 out of service. ray::RolloutWorker.apply() (pid=46975, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x302a7ffd0>)
(PPO pid=46962) ValueError: The two structures don't have the same nested structure.
(PPO pid=46962) 
(PPO pid=46962) First structure: type=Tensor str=Tensor("agent_0_wk2/Placeholder:0", shape=(?, 23236), dtype=float32)
(PPO pid=46962) 
(PPO pid=46962) Second structure: type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'RGB': array([[[[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [ 53, 132,  49],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         ...,
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]]]], dtype=uint8), 'POSITION': array([[7, 7]], dtype=int32)}
(PPO pid=46962) 
(PPO pid=46962) More specifically: Substructure "type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'RGB': array([[[[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [ 53, 132,  49],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         ...,
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]]]], dtype=uint8), 'POSITION': array([[7, 7]], dtype=int32)}" is a sequence, while substructure "type=Tensor str=Tensor("agent_0_wk2/Placeholder:0", shape=(?, 23236), dtype=float32)" is not
(PPO pid=46962) 
(PPO pid=46962) During handling of the above exception, another exception occurred:
(PPO pid=46962) 
(PPO pid=46962) ray::RolloutWorker.apply() (pid=46975, ip=127.0.0.1, repr=<ray.rllib.evaluation.rollout_worker.RolloutWorker object at 0x302a7ffd0>)
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 183, in apply
(PPO pid=46962)     raise e
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/utils/actor_manager.py", line 174, in apply
(PPO pid=46962)     return func(self, *args, **kwargs)
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/execution/rollout_ops.py", line 86, in <lambda>
(PPO pid=46962)     lambda w: w.sample(), local_worker=False, healthy_only=True
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/rollout_worker.py", line 900, in sample
(PPO pid=46962)     batches = [self.input_reader.next()]
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 92, in next
(PPO pid=46962)     batches = [self.get_data()]
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 285, in get_data
(PPO pid=46962)     item = next(self._env_runner)
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 694, in _env_runner
(PPO pid=46962)     eval_results = _do_policy_eval(
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/evaluation/sampler.py", line 1207, in _do_policy_eval
(PPO pid=46962)     eval_results[policy_id] = policy.compute_actions_from_input_dict(
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/policy/tf_policy.py", line 321, in compute_actions_from_input_dict
(PPO pid=46962)     to_fetch = self._build_compute_actions(
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/ray/rllib/policy/tf_policy.py", line 1087, in _build_compute_actions
(PPO pid=46962)     tree.map_structure(
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/tree/__init__.py", line 433, in map_structure
(PPO pid=46962)     assert_same_structure(structures[0], other, check_types=check_types)
(PPO pid=46962)   File "/Users/nell/miniforge3/lib/python3.10/site-packages/tree/__init__.py", line 288, in assert_same_structure
(PPO pid=46962)     raise type(e)("%s\n"
(PPO pid=46962) ValueError: The two structures don't have the same nested structure.
(PPO pid=46962) 
(PPO pid=46962) First structure: type=Tensor str=Tensor("agent_0_wk2/Placeholder:0", shape=(?, 23236), dtype=float32)
(PPO pid=46962) 
(PPO pid=46962) Second structure: type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'RGB': array([[[[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [ 53, 132,  49],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         ...,
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]]]], dtype=uint8), 'POSITION': array([[7, 7]], dtype=int32)}
(PPO pid=46962) 
(PPO pid=46962) More specifically: Substructure "type=dict str={'READY_TO_SHOOT': array([1.]), 'ORIENTATION': array([1], dtype=int32), 'RGB': array([[[[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175]],
(PPO pid=46962) 
(PPO pid=46962)         [[158, 194, 101],
(PPO pid=46962)          [158, 194, 101],
(PPO pid=46962)          [ 53, 132,  49],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         ...,
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [220, 205, 185]],
(PPO pid=46962) 
(PPO pid=46962)         [[210, 195, 175],
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          ...,
(PPO pid=46962)          [220, 205, 185],
(PPO pid=46962)          [210, 195, 175],
(PPO pid=46962)          [220, 205, 185]]]], dtype=uint8), 'POSITION': array([[7, 7]], dtype=int32)}" is a sequence, while substructure "type=Tensor str=Tensor("agent_0_wk2/Placeholder:0", shape=(?, 23236), dtype=float32)" is not
(PPO pid=46962) Entire first structure:
(PPO pid=46962) .
(PPO pid=46962) Entire second structure:
(PPO pid=46962) {'READY_TO_SHOOT': ., 'ORIENTATION': ., 'RGB': ., 'POSITION': .}
2023-01-30 13:48:20,055	ERROR tune.py:758 -- Trials did not complete: [PPO_meltingpot_4f50d_00000]
2023-01-30 13:48:20,056	INFO tune.py:762 -- Total run time: 27.62 seconds (27.27 seconds for the tuning loop).
<ray.tune.result_grid.ResultGrid object at 0x1780dbc10>
Traceback (most recent call last):
  File "/Users/nell/Documents/GitHub/norm-games/examples/rllib/self_play_train.py", line 159, in <module>
    main()
  File "/Users/nell/Documents/GitHub/norm-games/examples/rllib/self_play_train.py", line 155, in main
    assert results.num_errors == 0
AssertionError