diff --git a/.gitignore b/.gitignore
index 458e4c1ce..a067d4054 100644
--- a/.gitignore
+++ b/.gitignore
@@ -4,7 +4,7 @@
logs/
dist/
.eggs/
-xuanpolicy.egg-info/
+xuance.egg-info/
models/
*.sh
.VSCodeCounter/
diff --git a/LICENSE.txt b/LICENSE.txt
index f2aee945d..39b0ae484 100644
--- a/LICENSE.txt
+++ b/LICENSE.txt
@@ -1,5 +1,5 @@
MIT License
-Copyright (c) 2018 XuanPolicy contributors
+Copyright (c) 2018 XuanCe contributors
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
diff --git a/README.md b/README.md
index b458890c3..0ba91319f 100644
--- a/README.md
+++ b/README.md
@@ -1,13 +1,13 @@
![logo](./docs/source/figures/logo_1.png)
-# XuanPolicy: A Comprehensive and Unified Deep Reinforcement Learning Library
+# XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library
-[![PyPI](https://img.shields.io/pypi/v/xuanpolicy)](https://pypi.org/project/xuanpolicy/)
-[![Documentation Status](https://readthedocs.org/projects/xuanpolicy/badge/?version=latest)](https://xuanpolicy.readthedocs.io/en/latest/?badge=latest)
-![GitHub](https://img.shields.io/github/license/agi-brain/xuanpolicy)
-![GitHub Repo stars](https://img.shields.io/github/stars/agi-brain/xuanpolicy?style=social)
-![GitHub forks](https://img.shields.io/github/forks/agi-brain/xuanpolicy?style=social)
-![GitHub watchers](https://img.shields.io/github/watchers/agi-brain/xuanpolicy?style=social)
+[![PyPI](https://img.shields.io/pypi/v/xuance)](https://pypi.org/project/xuance/)
+[![Documentation Status](https://readthedocs.org/projects/xuance/badge/?version=latest)](https://xuance.readthedocs.io/en/latest/?badge=latest)
+![GitHub](https://img.shields.io/github/license/agi-brain/xuance)
+![GitHub Repo stars](https://img.shields.io/github/stars/agi-brain/xuance?style=social)
+![GitHub forks](https://img.shields.io/github/forks/agi-brain/xuance?style=social)
+![GitHub watchers](https://img.shields.io/github/watchers/agi-brain/xuance?style=social)
[![PyTorch](https://img.shields.io/badge/PyTorch-%3E%3D1.13.0-red)](https://pytorch.org/get-started/locally/)
[![TensorFlow](https://img.shields.io/badge/TensorFlow-%3E%3D2.6.0-orange)](https://www.tensorflow.org/install)
@@ -18,7 +18,7 @@
[![gymnasium](https://img.shields.io/badge/gymnasium-%3E%3D0.28.1-blue)](https://www.gymlibrary.dev/)
[![pettingzoo](https://img.shields.io/badge/PettingZoo-%3E%3D1.23.0-blue)](https://pettingzoo.farama.org/)
-**XuanPolicy** is an open-source ensemble of Deep Reinforcement Learning (DRL) algorithm implementations.
+**XuanCe** is an open-source ensemble of Deep Reinforcement Learning (DRL) algorithm implementations.
We call it as **Xuan-Ce (玄策)** in Chinese.
"**Xuan (玄)**" means incredible and magic box, "**Ce (策)**" means policy.
@@ -34,9 +34,9 @@ We expect it to be compatible with multiple deep learning toolboxes(
**[MindSpore](https://www.mindspore.cn/en)**),
and hope it can really become a zoo full of DRL algorithms.
-| **[Full Documentation](https://xuanpolicy.readthedocs.io/en/latest)** |
- **[中文文档](https://xuanpolicy.readthedocs.io/zh/latest/)** |
- **[OpenI (启智社区)](https://openi.pcl.ac.cn/OpenRelearnware/XuanPolicy)** |
+| **[Full Documentation](https://xuance.readthedocs.io/en/latest)** |
+ **[中文文档](https://xuance.readthedocs.io/zh/latest/)** |
+ **[OpenI (启智社区)](https://openi.pcl.ac.cn/OpenRelearnware/XuanCe)** |
**[XuanCe (Mini version)](https://github.com/wzcai99/xuance)** |
## Currently Included Algorithms
@@ -242,10 +242,10 @@ StarCraft Multi-Agent Challenge.
The library can be run at Linux, Windows, MacOS, and EulerOS, etc.
-Before installing **XuanPolicy**, you should install [Anaconda](https://www.anaconda.com/download) to prepare a python environment.
+Before installing **XuanCe**, you should install [Anaconda](https://www.anaconda.com/download) to prepare a python environment.
(Note: select a proper version of Anaconda from [**here**](https://repo.anaconda.com/archive/).)
-After that, open a terminal and install **XuanPolicy** by the following steps.
+After that, open a terminal and install **XuanCe** by the following steps.
**Step 1**: Create a new conda environment (python>=3.7 is suggested):
@@ -262,14 +262,14 @@ conda activate xpolicy
**Step 3**: Install the library:
```commandline
-pip install xuanpolicy
+pip install xuance
```
-This command does not include the dependencies of deep learning toolboxes. To install the **XuanPolicy** with
-deep learning tools, you can type `pip install xuanpolicy[torch]` for [PyTorch](https://pytorch.org/get-started/locally/),
-`pip install xuanpolicy[tensorflow]` for [TensorFlow2](https://www.tensorflow.org/install),
-`pip install xuanpolicy[mindspore]` for [MindSpore](https://www.mindspore.cn/install/en),
-and `pip install xuanpolicy[all]` for all dependencies.
+This command does not include the dependencies of deep learning toolboxes. To install **XuanCe** with
+deep learning tools, you can type `pip install xuance[torch]` for [PyTorch](https://pytorch.org/get-started/locally/),
+`pip install xuance[tensorflow]` for [TensorFlow2](https://www.tensorflow.org/install),
+`pip install xuance[mindspore]` for [MindSpore](https://www.mindspore.cn/install/en),
+and `pip install xuance[all]` for all dependencies.
Note: Some extra packages should be installed manually for further usage.
@@ -280,7 +280,7 @@ Note: Some extra packages should be installed manually for further usage.
#### Train a Model
```python
-import xuanpolicy as xp
+import xuance as xp
runner = xp.get_runner(method='dqn',
env='classic_control',
@@ -292,7 +292,7 @@ runner.run()
#### Test the Model
```python
-import xuanpolicy as xp
+import xuance as xp
runner_test = xp.get_runner(method='dqn',
env='classic_control',
@@ -485,8 +485,8 @@ $ tensorboard --logdir ./logs/dqn/torch/CartPole-v0
[//]: # (| micro_focus | | | | | | | | |)
```
-@misc{XuanPolicy2023,
- title={XuanPolicy: A Comprehensive and Unified Deep Reinforcement Learning Library},
+@misc{XuanCe2023,
+ title={XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library},
author={Wenzhang Liu, Wenzhe Cai, Kun Jiang, Guangran Cheng, Yuanda Wang,
Jiawei Wang, Jingyu Cao, Lele Xu, Chaoxu Mu, Changyin Sun},
publisher = {GitHub},
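
For reference, the only user-facing change in the README snippets above is the package name in the import. Below is a minimal sketch of the updated quick-start flow; the `env_id` and `is_test` keyword arguments, and the mirrored test call, are assumptions (the hunks above truncate the full `get_runner(...)` call), so treat them as illustrative rather than the confirmed signature.

```python
import xuance as xp  # formerly: import xuanpolicy as xp

# Train a DQN agent on a classic-control task.
# env_id / is_test are assumed keyword arguments; the README hunks above
# only show method= and env= before the call is truncated.
runner = xp.get_runner(method='dqn',
                       env='classic_control',
                       env_id='CartPole-v0',
                       is_test=False)
runner.run()

# Test the trained model through the same entry point.
runner_test = xp.get_runner(method='dqn',
                            env='classic_control',
                            env_id='CartPole-v0',
                            is_test=True)
runner_test.run()
```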
diff --git a/benchmark.py b/benchmark.py
index 89c49f04a..d6b49d5a6 100644
--- a/benchmark.py
+++ b/benchmark.py
@@ -1,5 +1,5 @@
import argparse
-from xuanpolicy import get_runner
+from xuance import get_runner
def parse_args():
diff --git a/benchmark_marl.py b/benchmark_marl.py
index 8fcacaa0f..6a458b4fd 100644
--- a/benchmark_marl.py
+++ b/benchmark_marl.py
@@ -1,5 +1,5 @@
import argparse
-from xuanpolicy import get_runner
+from xuance import get_runner
def parse_args():
diff --git a/demo.py b/demo.py
index d31e5fcf6..29253b3f2 100644
--- a/demo.py
+++ b/demo.py
@@ -1,5 +1,5 @@
import argparse
-from xuanpolicy import get_runner
+from xuance import get_runner
def parse_args():
diff --git a/demo_marl.py b/demo_marl.py
index 568da4a9c..9bc9298dd 100644
--- a/demo_marl.py
+++ b/demo_marl.py
@@ -1,7 +1,7 @@
import argparse
-import xuanpolicy.torch.agents
-from xuanpolicy import get_runner
+import xuance.torch.agents
+from xuance import get_runner
def parse_args():
diff --git a/docs/build/html/_sources/documents/api/agents/drl/a2c.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/a2c.rst.txt
index 104a6c7d7..65e154c60 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/a2c.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/a2c.rst.txt
@@ -8,12 +8,12 @@ A2C_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.a2c_agent.A2C_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.a2c_agent.A2C_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ A2C_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.a2c_agent.A2C_Agent._action(obs)
+ xuance.torch.agent.policy_gradient.a2c_agent.A2C_Agent._action(obs)
Calculate actions according to the observations.
@@ -34,7 +34,7 @@ A2C_Agent
:rtype: np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.a2c_agent.A2C_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.a2c_agent.A2C_Agent.train(train_steps)
Train the A2C agent.
@@ -42,7 +42,7 @@ A2C_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.a2c_agent.A2C_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.a2c_agent.A2C_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -79,7 +79,7 @@ Source Code
import numpy as np
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class A2C_Agent(Agent):
diff --git a/docs/build/html/_sources/documents/api/agents/drl/basic_drl_class.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/basic_drl_class.rst.txt
index 89f81a73e..c75571048 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/basic_drl_class.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/basic_drl_class.rst.txt
@@ -1,23 +1,23 @@
Agent
=======================
-To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.agents.agent.Agent`` , ``xuanpolicy.tensorflow.agents.agent.Agent``, or ``xuanpolicy.mindspore.agents.agent.Agent``.
+To create a new Agent, you should build a class that inherits from ``xuance.torch.agents.agent.Agent``, ``xuance.tensorflow.agents.agent.Agent``, or ``xuance.mindspore.agents.agent.Agent``.
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agents.agent.Agent(config, envs, policy, memory, learner, device, log_dir, model_dir)
+ xuance.torch.agents.agent.Agent(config, envs, policy, memory, learner, device, log_dir, model_dir)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param memory: Experice replay buffer.
- :type memory: xuanpolicy.common.memory_tools.Buffer
+ :type memory: xuance.common.memory_tools.Buffer
:param learner: The learner that updates parameters of policy.
- :type learner: xuanpolicy.torch.learner.Learner
+ :type learner: xuance.torch.learner.Learner
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
:param log_dir: The directory of log file, default is "./logs/".
@@ -25,14 +25,14 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
:param model_dir: The directory of model file, default is "./models/".
:type model_dir: str
-.. py:function:: xuanpolicy.torch.agents.agent.Agent.save_model(model_name)
+.. py:function:: xuance.torch.agents.agent.Agent.save_model(model_name)
Save the model.
:param model_name: The model's name to be saved.
:type model_name: str
-.. py:function:: xuanpolicy.torch.agents.agent.Agent.load_model(path, seed)
+.. py:function:: xuance.torch.agents.agent.Agent.load_model(path, seed)
Load a model by specifying the ``path`` and ``seed`` .
@@ -41,7 +41,7 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
:param seed: Select the seed that model was trained with if it exits.
:type seed: int
-.. py:function:: xuanpolicy.torch.agents.agent.Agent.log_infos(info, x_index)
+.. py:function:: xuance.torch.agents.agent.Agent.log_infos(info, x_index)
Visualize the training information via wandb or tensorboard.
@@ -50,7 +50,7 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
:param x_index: Current step.
:type x_index: int
-.. py:function:: xuanpolicy.torch.agents.agent.Agent.log_videos(info, fps x_index)
+.. py:function:: xuance.torch.agents.agent.Agent.log_videos(info, fps, x_index)
Visualize the interaction between agent and environment by uploading the videos with wandb or tensorboard.
@@ -61,7 +61,7 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
:param x_index: Current step.
:type x_index: int
-.. py:function:: xuanpolicy.torch.agents.agent.Agent._process_observation(observations)
+.. py:function:: xuance.torch.agents.agent.Agent._process_observation(observations)
Normalize the original observations.
@@ -70,7 +70,7 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
:return: The normalized observations.
:rtype: numpy.ndarray
-.. py:function:: xuanpolicy.torch.agents.agent.Agent._process_reward(rewards)
+.. py:function:: xuance.torch.agents.agent.Agent._process_reward(rewards)
Normalize the original rewards.
@@ -79,21 +79,21 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
:return: The normalized observations rewards.
:rtype: numpy.ndarray
-.. py:function:: xuanpolicy.torch.agents.agent.Agent._action(observations)
+.. py:function:: xuance.torch.agents.agent.Agent._action(observations)
Get actions for executing according to the observations.
:param observations: The original observations of agent.
:type observations: numpy.ndarray
-.. py:function:: xuanpolicy.torch.agents.agent.Agent.train(steps)
+.. py:function:: xuance.torch.agents.agent.Agent.train(steps)
Train the agents with ``steps`` steps.
:param steps: The training steps.
:type steps: int
-.. py:function:: xuanpolicy.torch.agents.agent.Agent.test(env_fn, steps)
+.. py:function:: xuance.torch.agents.agent.Agent.test(env_fn, steps)
Test the agents.
@@ -101,7 +101,7 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
:param steps: The training steps.
:type steps: int
-.. py:function:: xuanpolicy.torch.agents.agent.Agent.finish()
+.. py:function:: xuance.torch.agents.agent.Agent.finish()
Finish the wandb or tensorboard.
@@ -113,18 +113,18 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
**TensorFlow:**
.. py:class::
- xuanpolicy.tensorflowtensorflow.agent.agent.Agent(config, envs, policy, memory, learner, device, log_dir, model_dir)
+ xuance.tensorflow.agent.agent.Agent(config, envs, policy, memory, learner, device, log_dir, model_dir)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param memory: Experice replay buffer.
- :type memory: xuanpolicy.common.memory_tools.Buffer
+ :type memory: xuance.common.memory_tools.Buffer
:param learner: The learner that updates parameters of policy.
- :type learner: xuanpolicy.tensorflow.learner.Learner
+ :type learner: xuance.tensorflow.learner.Learner
:param device: Choose CPU or GPU to train the model.
:type device: str
:param log_dir: The directory of log file, default is "./logs/".
@@ -140,16 +140,16 @@ To create a new Agent, you should build a class inherit from ``xuanpolicy.torch.
**MindSpore:**
.. py:class::
- xuanpolicy.mindsporetensorflow.agent.agent.Agent(envs, policy, memory, learner, device, log_dir, model_dir)
+ xuance.mindspore.agent.agent.Agent(envs, policy, memory, learner, device, log_dir, model_dir)
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param memory: Experice replay buffer.
- :type memory: xuanpolicy.common.memory_tools.Buffer
+ :type memory: xuance.common.memory_tools.Buffer
:param learner: The learner that updates parameters of policy.
- :type learner: xuanpolicy.mindspore.learner.Learner
+ :type learner: xuance.mindspore.learner.Learner
:param device: Choose CPU or GPU to train the model.
:type device: str
:param log_dir: The directory of log file, default is "./logs/".
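
As a quick illustration of the base class documented in this file, here is a minimal sketch of a custom agent under the renamed package. It assumes only the constructor and method signatures listed above (``config, envs, policy, memory, learner, device, log_dir, model_dir``; ``_action``, ``train``, ``test``); the class name and the stub bodies are illustrative, not part of the library.

```python
from xuance.torch.agents import *  # the source snippets above import Agent this way


class MyAgent(Agent):
    # Hedged sketch: signatures follow the base-class documentation above;
    # bodies are left as stubs rather than guessing at internal APIs.
    def __init__(self, config, envs, policy, memory, learner,
                 device='cpu', log_dir='./logs/', model_dir='./models/'):
        super(MyAgent, self).__init__(config, envs, policy, memory, learner,
                                      device, log_dir, model_dir)

    def _action(self, observations):
        # The base class provides the documented _process_observation helper
        # for normalizing raw environment outputs before the policy sees them.
        observations = self._process_observation(observations)
        raise NotImplementedError("query self.policy here and return actions")

    def train(self, steps):
        raise NotImplementedError("interact with self.envs for `steps` steps")

    def test(self, env_fn, steps):
        raise NotImplementedError
```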
diff --git a/docs/build/html/_sources/documents/api/agents/drl/c51.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/c51.rst.txt
index 883effd27..d1a1c5051 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/c51.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/c51.rst.txt
@@ -8,12 +8,12 @@ C51_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.qlearning_family.c51_agent.C51_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.qlearning_family.c51_agent.C51_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ C51_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.c51_agent.C51_Agent._action(obs, egreedy)
+ xuance.torch.agent.qlearning_family.c51_agent.C51_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ C51_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.c51_agent.C51_Agent.train(train_steps)
+ xuance.torch.agent.qlearning_family.c51_agent.C51_Agent.train(train_steps)
Train the C51DQN agent.
@@ -44,7 +44,7 @@ C51_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.c51_agent.C51_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.qlearning_family.c51_agent.C51_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -81,7 +81,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class C51_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/ddpg.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/ddpg.rst.txt
index a30811bf2..3bfb1f300 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/ddpg.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/ddpg.rst.txt
@@ -8,12 +8,12 @@ DDPG_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizers of actor and critic that update the parameters.
@@ -24,7 +24,7 @@ DDPG_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent._action(obs, noise_scale)
+ xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent._action(obs, noise_scale)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ DDPG_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.train(train_steps)
Train the DDPG agent.
@@ -44,7 +44,7 @@ DDPG_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -79,7 +79,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class DDPG_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/ddqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/ddqn.rst.txt
index 7d8888b59..3020c54e6 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/ddqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/ddqn.rst.txt
@@ -10,12 +10,12 @@ DQN with double q-learning trick.
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -26,7 +26,7 @@ DQN with double q-learning trick.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -38,7 +38,7 @@ DQN with double q-learning trick.
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.train(train_steps)
+ xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.train(train_steps)
Train the Double DQN agent.
@@ -46,7 +46,7 @@ DQN with double q-learning trick.
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -82,7 +82,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class DDQN_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/dqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/dqn.rst.txt
index 5fabbb4f0..89b9d2648 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/dqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/dqn.rst.txt
@@ -8,12 +8,12 @@ DQN_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.qlearning_family.dqn_agent.DQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.qlearning_family.dqn_agent.DQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ DQN_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.dqn_agent.DQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.qlearning_family.dqn_agent.DQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ DQN_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.dqn_agent.DQN_Agent.train(train_steps)
+ xuance.torch.agent.qlearning_family.dqn_agent.DQN_Agent.train(train_steps)
Train the DQN agent.
@@ -44,7 +44,7 @@ DQN_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.dqn_agent.DQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.qlearning_family.dqn_agent.DQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -80,7 +80,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class DQN_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/dueldqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/dueldqn.rst.txt
index 4ff3ac80e..37d7c38a2 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/dueldqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/dueldqn.rst.txt
@@ -8,12 +8,12 @@ DuelDQN_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.qlearning_family.dueldqn_agent.DuelDQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.qlearning_family.dueldqn_agent.DuelDQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ DuelDQN_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.dueldqn_agent.DuelDQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.qlearning_family.dueldqn_agent.DuelDQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ DuelDQN_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.dueldqn_agent.DuelDQN_Agent.train(train_steps)
+ xuance.torch.agent.qlearning_family.dueldqn_agent.DuelDQN_Agent.train(train_steps)
Train the Duel-DQN agent.
@@ -44,7 +44,7 @@ DuelDQN_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.dueldqn_agent.DuelDQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.qlearning_family.dueldqn_agent.DuelDQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -80,7 +80,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class DuelDQN_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/mpdqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/mpdqn.rst.txt
index ef2d8f520..8197e602e 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/mpdqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/mpdqn.rst.txt
@@ -8,12 +8,12 @@ MPDQN_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ MPDQN_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ MPDQN_Agent
:rtype: np.ndarray, np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent.pad_action(disaction, conaction)
+ xuance.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent.pad_action(disaction, conaction)
:param disaction: The discrete actions.
:type disaction: numpy.ndarray
@@ -46,7 +46,7 @@ MPDQN_Agent
:rtype: tuple(numpy.ndarray, numpy.ndarray)
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent.train(train_steps)
Train the MPDQN agent.
@@ -54,7 +54,7 @@ MPDQN_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.mpdqn_agent.MPDQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -90,7 +90,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
import gym
from gym import spaces
diff --git a/docs/build/html/_sources/documents/api/agents/drl/noisydqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/noisydqn.rst.txt
index ee9d4b0a3..884fac6fd 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/noisydqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/noisydqn.rst.txt
@@ -8,12 +8,12 @@ NoisyDQN_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.qlearning_family.noisydqn_agent.NoisyDQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.qlearning_family.noisydqn_agent.NoisyDQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ NoisyDQN_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.noisydqn_agent.NoisyDQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.qlearning_family.noisydqn_agent.NoisyDQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ NoisyDQN_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.noisydqn_agent.NoisyDQN_Agent.train(train_steps)
+ xuance.torch.agent.qlearning_family.noisydqn_agent.NoisyDQN_Agent.train(train_steps)
Train the Noisy-DQN agent.
@@ -44,7 +44,7 @@ NoisyDQN_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.noisydqn_agent.NoisyDQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.qlearning_family.noisydqn_agent.NoisyDQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -80,7 +80,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class NoisyDQN_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/pdqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/pdqn.rst.txt
index e8a2f96d2..3731a58a7 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/pdqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/pdqn.rst.txt
@@ -8,12 +8,12 @@ PDQN_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ PDQN_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ PDQN_Agent
:rtype: np.ndarray, np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent.pad_action(disaction, conaction)
+ xuance.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent.pad_action(disaction, conaction)
:param disaction: The discrete actions.
:type disaction: numpy.ndarray
@@ -46,7 +46,7 @@ PDQN_Agent
:rtype: tuple(numpy.ndarray, numpy.ndarray)
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent.train(train_steps)
Train the PDQN agent.
@@ -54,7 +54,7 @@ PDQN_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.pdqn_agent.PDQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -92,7 +92,7 @@ Source Code
import numpy as np
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
import gym
from gym import spaces
diff --git a/docs/build/html/_sources/documents/api/agents/drl/perdqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/perdqn.rst.txt
index 2a079615a..c790164c8 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/perdqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/perdqn.rst.txt
@@ -8,12 +8,12 @@ PerDQN_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.qlearning_family.perdqn_agent.PerDQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.qlearning_family.perdqn_agent.PerDQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ PerDQN_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.perdqn_agent.PerDQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.qlearning_family.perdqn_agent.PerDQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ PerDQN_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.perdqn_agent.PerDQN_Agent.train(train_steps)
+ xuance.torch.agent.qlearning_family.perdqn_agent.PerDQN_Agent.train(train_steps)
Train the PerDQN agent.
@@ -44,7 +44,7 @@ PerDQN_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.perdqn_agent.PerDQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.qlearning_family.perdqn_agent.PerDQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -80,7 +80,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class PerDQN_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/pg.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/pg.rst.txt
index b8bb9fb94..08f61ad93 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/pg.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/pg.rst.txt
@@ -8,12 +8,12 @@ PG_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.pg_agent.PG_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.pg_agent.PG_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ PG_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.pg_agent.PG_Agent._action(obs)
+ xuance.torch.agent.policy_gradient.pg_agent.PG_Agent._action(obs)
Calculate actions according to the observations.
@@ -34,7 +34,7 @@ PG_Agent
:rtype: np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.pg_agent.PG_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.pg_agent.PG_Agent.train(train_steps)
Train the PG agent.
@@ -42,7 +42,7 @@ PG_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.pg_agent.PG_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.pg_agent.PG_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -77,7 +77,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class PG_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/ppg.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/ppg.rst.txt
index dc083075f..2f30f5334 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/ppg.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/ppg.rst.txt
@@ -8,12 +8,12 @@ PPG_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.ppg_agent.PPG_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.ppg_agent.PPG_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ PPG_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppg_agent.PPG_Agent._action(obs)
+ xuance.torch.agent.policy_gradient.ppg_agent.PPG_Agent._action(obs)
Calculate actions according to the observations.
@@ -34,7 +34,7 @@ PPG_Agent
:rtype: np.ndarray, np.ndarray, torch.distributions
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppg_agent.PPG_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.ppg_agent.PPG_Agent.train(train_steps)
Train the PPG agent.
@@ -42,7 +42,7 @@ PPG_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppg_agent.PPG_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.ppg_agent.PPG_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -77,7 +77,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class PPG_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/ppo_clip.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/ppo_clip.rst.txt
index c9a9bfd78..acfad893d 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/ppo_clip.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/ppo_clip.rst.txt
@@ -8,12 +8,12 @@ PPOCLIP_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.ppoclip_agent.PPOCLIP_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.ppoclip_agent.PPOCLIP_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ PPOCLIP_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppoclip_agent.PPOCLIP_Agent._action(obs)
+ xuance.torch.agent.policy_gradient.ppoclip_agent.PPOCLIP_Agent._action(obs)
Calculate actions according to the observations.
@@ -34,7 +34,7 @@ PPOCLIP_Agent
:rtype: np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppoclip_agent.PPOCLIP_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.ppoclip_agent.PPOCLIP_Agent.train(train_steps)
Train the PPO agent.
@@ -42,7 +42,7 @@ PPOCLIP_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppoclip_agent.PPOCLIP_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.ppoclip_agent.PPOCLIP_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -77,7 +77,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class PPOCLIP_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/ppo_kl.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/ppo_kl.rst.txt
index e42e1a6fb..5de10a65f 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/ppo_kl.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/ppo_kl.rst.txt
@@ -8,12 +8,12 @@ PPOKL_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.ppokl_agent.PPOKL_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.ppokl_agent.PPOKL_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ PPOKL_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppokl_agent.PPOKL_Agent._action(obs)
+ xuance.torch.agent.policy_gradient.ppokl_agent.PPOKL_Agent._action(obs)
Calculate actions according to the observations.
@@ -34,7 +34,7 @@ PPOKL_Agent
:rtype: np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppokl_agent.PPOKL_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.ppokl_agent.PPOKL_Agent.train(train_steps)
Train the PPO agent.
@@ -42,7 +42,7 @@ PPOKL_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.ppokl_agent.PPOKL_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.ppokl_agent.PPOKL_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -77,7 +77,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class PPOKL_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/qrdqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/qrdqn.rst.txt
index 435e756e5..453ee9f44 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/qrdqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/qrdqn.rst.txt
@@ -8,12 +8,12 @@ QRDQN_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.qlearning_family.qrdqn_agent.QRDQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.qlearning_family.qrdqn_agent.QRDQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ QRDQN_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.qrdqn_agent.QRDQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.qlearning_family.qrdqn_agent.QRDQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ QRDQN_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.qrdqn_agent.QRDQN_Agent.train(train_steps)
+ xuance.torch.agent.qlearning_family.qrdqn_agent.QRDQN_Agent.train(train_steps)
Train the QRDQN agent.
@@ -44,7 +44,7 @@ QRDQN_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.qlearning_family.qrdqn_agent.QRDQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.qlearning_family.qrdqn_agent.QRDQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -80,7 +80,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class QRDQN_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/sac.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/sac.rst.txt
index 5852eee47..f3be20a87 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/sac.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/sac.rst.txt
@@ -8,12 +8,12 @@ SAC_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.sac_agent.SAC_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.sac_agent.SAC_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizers of actor and critic that update the parameters.
@@ -24,7 +24,7 @@ SAC_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.sac_agent.SAC_Agent._action(obs)
+ xuance.torch.agent.policy_gradient.sac_agent.SAC_Agent._action(obs)
Calculate actions according to the observations.
@@ -34,7 +34,7 @@ SAC_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.sac_agent.SAC_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.sac_agent.SAC_Agent.train(train_steps)
Train the SAC agent.
@@ -42,7 +42,7 @@ SAC_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.sac_agent.SAC_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.sac_agent.SAC_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -77,7 +77,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class SAC_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/sac_dis.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/sac_dis.rst.txt
index 998543749..8fd487471 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/sac_dis.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/sac_dis.rst.txt
@@ -8,12 +8,12 @@ SACDIS_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.sacdis_agent.SACDIS_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.sacdis_agent.SACDIS_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizers of actor and critic that update the parameters.
@@ -24,7 +24,7 @@ SACDIS_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.sacdis_agent.SACDIS_Agent._action(obs)
+ xuance.torch.agent.policy_gradient.sacdis_agent.SACDIS_Agent._action(obs)
Calculate actions according to the observations.
@@ -34,7 +34,7 @@ SACDIS_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.sacdis_agent.SACDIS_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.sacdis_agent.SACDIS_Agent.train(train_steps)
Train the SACDIS agent.
@@ -42,7 +42,7 @@ SACDIS_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.sacdis_agent.SACDIS_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.sacdis_agent.SACDIS_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -77,7 +77,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class SACDIS_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/drl/spdqn.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/spdqn.rst.txt
index d5edb3279..47056dc42 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/spdqn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/spdqn.rst.txt
@@ -8,12 +8,12 @@ SPDQN_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizer that updates the parameters.
@@ -24,7 +24,7 @@ SPDQN_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent._action(obs, egreedy)
+ xuance.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent._action(obs, egreedy)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ SPDQN_Agent
:rtype: np.ndarray, np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent.pad_action(disaction, conaction)
+ xuance.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent.pad_action(disaction, conaction)
:param disaction: The discrete actions.
:type disaction: numpy.ndarray
@@ -46,7 +46,7 @@ SPDQN_Agent
:rtype: tuple(numpy.ndarray, numpy.ndarray)
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent.train(train_steps)
Train the SPDQN agent.
@@ -54,7 +54,7 @@ SPDQN_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.spdqn_agent.SPDQN_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -90,7 +90,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
import gym
from gym import spaces
diff --git a/docs/build/html/_sources/documents/api/agents/drl/td3.rst.txt b/docs/build/html/_sources/documents/api/agents/drl/td3.rst.txt
index dc26c8b92..a695e1d57 100644
--- a/docs/build/html/_sources/documents/api/agents/drl/td3.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/drl/td3.rst.txt
@@ -8,12 +8,12 @@ TD3_Agent
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.policy_gradient.td3_agent.TD3_Agent(config, envs, policy, optimizer, scheduler, device)
+ xuance.torch.agent.policy_gradient.td3_agent.TD3_Agent(config, envs, policy, optimizer, scheduler, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param optimizer: The optimizers of actor and critic that update the parameters.
@@ -24,7 +24,7 @@ TD3_Agent
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.td3_agent.TD3_Agent._action(obs, noise_scale)
+ xuance.torch.agent.policy_gradient.td3_agent.TD3_Agent._action(obs, noise_scale)
Calculate actions according to the observations.
@@ -36,7 +36,7 @@ TD3_Agent
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.td3_agent.TD3_Agent.train(train_steps)
+ xuance.torch.agent.policy_gradient.td3_agent.TD3_Agent.train(train_steps)
Train the TD3 agent.
@@ -44,7 +44,7 @@ TD3_Agent
:type train_steps: int
.. py:function::
- xuanpolicy.torch.agent.policy_gradient.td3_agent.TD3_Agent.test(env_fn, test_episodes)
+ xuance.torch.agent.policy_gradient.td3_agent.TD3_Agent.test(env_fn, test_episodes)
Test the trained model.
@@ -79,7 +79,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class TD3_Agent(Agent):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/basic_marl_class.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/basic_marl_class.rst.txt
index 42d168875..41a298c61 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/basic_marl_class.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/basic_marl_class.rst.txt
@@ -1,23 +1,23 @@
MARLAgent
=======================
-To create new MARL agents, you should build a class inherit from ``xuanpolicy.torch.agents.agents_marl.MARLAgent`` , ``xuanpolicy.tensorflow.agents.agents_marl.MARLAgent``, or ``xuanpolicy.mindspore.agents.agents_marl.MARLAgent``.
+To create new MARL agents, you should build a class that inherits from ``xuance.torch.agents.agents_marl.MARLAgent``, ``xuance.tensorflow.agents.agents_marl.MARLAgent``, or ``xuance.mindspore.agents.agents_marl.MARLAgent``.
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agents.agents_marl.MARLAgent(config, envs, policy, memory, learner, device, log_dir, model_dir)
+ xuance.torch.agents.agents_marl.MARLAgent(config, envs, policy, memory, learner, device, log_dir, model_dir)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param memory: Experice replay buffer.
- :type memory: xuanpolicy.common.memory_tools.Buffer
+ :type memory: xuance.common.memory_tools.Buffer
:param learner: The learner that updates parameters of policy.
- :type learner: xuanpolicy.torch.learner.LearnerMAS
+ :type learner: xuance.torch.learner.LearnerMAS
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
:param log_dir: The directory of log file, default is "./logs/".
@@ -25,14 +25,14 @@ To create new MARL agents, you should build a class inherit from ``xuanpolicy.to
:param model_dir: The directory of model file, default is "./models/".
:type model_dir: str
-.. py:function:: xuanpolicy.torch.agents.agents_marl.MARLAgent.save_model(model_name)
+.. py:function:: xuance.torch.agents.agents_marl.MARLAgent.save_model(model_name)
Save the model.
:param model_name: The model's name to be saved.
:type model_name: str
-.. py:function:: xuanpolicy.torch.agents.agents_marl.MARLAgent.load_model(path, seed)
+.. py:function:: xuance.torch.agents.agents_marl.MARLAgent.load_model(path, seed)
Load a model by specifying the ``path`` and ``seed`` .
@@ -41,14 +41,14 @@ To create new MARL agents, you should build a class inherit from ``xuanpolicy.to
:param seed: Select the seed that model was trained with if it exits.
:type seed: int
-.. py:function:: xuanpolicy.torch.agents.agents_marl.MARLAgent.act(**kwargs)
+.. py:function:: xuance.torch.agents.agents_marl.MARLAgent.act(**kwargs)
Get actions for executing according to the joint observations, global states, available actions, etc.
:param kwargs: Inputs informations.
:type observations: Dict
-.. py:function:: xuanpolicy.torch.agents.agents_marl.MARLAgent.train(**kwargs)
+.. py:function:: xuance.torch.agents.agents_marl.MARLAgent.train(**kwargs)
Train the multi-agent reinforcement learning models.
@@ -59,7 +59,7 @@ To create new MARL agents, you should build a class inherit from ``xuanpolicy.to
.. py:class::
- xuanpolicy.torch.agents.agents_marl.linear_decay_or_increase(start, end, step_length)
+ xuance.torch.agents.agents_marl.linear_decay_or_increase(start, end, step_length)
:param start: Start factor.
:type start: np.float
@@ -68,23 +68,23 @@ To create new MARL agents, you should build a class inherit from ``xuanpolicy.to
:param step_length: The number of steps the factor decays or increases.
:type step_length: int
-.. py:function:: xuanpolicy.torch.agents.agents_marl.linear_decay_or_increase.update()
+.. py:function:: xuance.torch.agents.agents_marl.linear_decay_or_increase.update()
Update the factor once.
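For illustration, a scheduler of this kind might be used as follows; the numeric values and the variable name are arbitrary examples, and the attribute that stores the current factor is not shown here.

.. code-block:: python

    from xuance.torch.agents.agents_marl import linear_decay_or_increase

    # Assumed values: anneal an exploration factor from 1.0 to 0.05 over 50000 steps.
    egreedy = linear_decay_or_increase(start=1.0, end=0.05, step_length=50000)
    for _ in range(100):
        egreedy.update()  # move the factor one step towards ``end``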
.. py:class::
- xuanpolicy.torch.agents.agents_marl.RandomAgents(args, envs, device=None)
+ xuance.torch.agents.agents_marl.RandomAgents(args, envs, device=None)
:param args: Provides hyper parameters.
:type args: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agents.agents_marl.RandomAgents.act()
+ xuance.torch.agents.agents_marl.RandomAgents.act()
Provide random actions for RandomAgents.
@@ -99,18 +99,18 @@ To create new MARL agents, you should build a class inherit from ``xuanpolicy.to
**TensorFlow:**
.. py:class::
- xuanpolicy.tensorflow.agents.agents_marl.MARLAgent(config, envs, policy, memory, learner, device, log_dir, model_dir)
+ xuance.tensorflow.agents.agents_marl.MARLAgent(config, envs, policy, memory, learner, device, log_dir, model_dir)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param memory: Experience replay buffer.
- :type memory: xuanpolicy.common.memory_tools.Buffer
+ :type memory: xuance.common.memory_tools.Buffer
:param learner: The learner that updates parameters of policy.
- :type learner: xuanpolicy.tensorflow.learner.Learner
+ :type learner: xuance.tensorflow.learner.Learner
:param device: Choose CPU or GPU to train the model.
:type device: str
:param log_dir: The directory of log file, default is "./logs/".
@@ -126,16 +126,16 @@ To create new MARL agents, you should build a class inherit from ``xuanpolicy.to
**MindSpore:**
.. py:class::
- xuanpolicy.mindspore.agents.agents_marl.MARLAgent(envs, policy, memory, learner, device, log_dir, model_dir)
+ xuance.mindspore.agents.agents_marl.MARLAgent(envs, policy, memory, learner, device, log_dir, model_dir)
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param policy: The policy that provides actions and values.
:type policy: nn.Module
:param memory: Experience replay buffer.
- :type memory: xuanpolicy.common.memory_tools.Buffer
+ :type memory: xuance.common.memory_tools.Buffer
:param learner: The learner that updates parameters of policy.
- :type learner: xuanpolicy.mindspore.learner.Learner
+ :type learner: xuance.mindspore.learner.Learner
:param device: Choose CPU or GPU to train the model.
:type device: str
:param log_dir: The directory of log file, default is "./logs/".
@@ -158,7 +158,7 @@ Source Code
.. code-block:: python
import os.path
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class MARLAgents(object):
@@ -247,7 +247,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.tensorflow.agents import *
+ from xuance.tensorflow.agents import *
class MARLAgents(object):
def __init__(self,
@@ -355,7 +355,7 @@ Source Code
import mindspore as ms
import mindspore.ops as ops
from mindspore import Tensor
- from xuanpolicy.mindspore.agents import *
+ from xuance.mindspore.agents import *
class MARLAgents(object):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/coma.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/coma.rst.txt
index abd50ed63..0d85ac500 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/coma.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/coma.rst.txt
@@ -8,17 +8,17 @@ COMA_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.coma_agents.COMA_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.coma_agents.COMA_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.coma_agents.COMA_Agents.act(obs_n, *rnn_hidden, avail_actions=None, state=None, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.coma_agents.COMA_Agents.act(obs_n, *rnn_hidden, avail_actions=None, state=None, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -36,7 +36,7 @@ COMA_Agents
:rtype: tuple(numpy.ndarray, numpy.ndarray), np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.coma_agents.COMA_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.coma_agents.COMA_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -72,8 +72,8 @@ Source Code
.. code-block:: python
import torch
- from xuanpolicy.torch.agents import *
- from xuanpolicy.torch.agents.agents_marl import linear_decay_or_increase
+ from xuance.torch.agents import *
+ from xuance.torch.agents.agents_marl import linear_decay_or_increase
class COMA_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/dcg.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/dcg.rst.txt
index 3be9bf172..0d92d064c 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/dcg.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/dcg.rst.txt
@@ -8,17 +8,17 @@ DCG_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.dcg_agents.DCG_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.dcg_agents.DCG_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.dcg_agents.DCG_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.dcg_agents.DCG_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -34,7 +34,7 @@ DCG_Agents
:rtype: tuple(numpy.ndarray, numpy.ndarray), np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.dcg_agents.DCG_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.dcg_agents.DCG_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -70,7 +70,7 @@ Source Code
.. code-block:: python
import torch.nn
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class DCG_Agents(MARLAgents):
def __init__(self,
@@ -92,7 +92,7 @@ Source Code
else:
representation = REGISTRY_Representation[config.representation](*input_representation)
repre_state_dim = representation.output_shapes['state'][0]
- from xuanpolicy.torch.policies.coordination_graph import DCG_utility, DCG_payoff, Coordination_Graph
+ from xuance.torch.policies.coordination_graph import DCG_utility, DCG_payoff, Coordination_Graph
utility = DCG_utility(repre_state_dim, config.hidden_utility_dim, config.dim_act).to(device)
payoffs = DCG_payoff(repre_state_dim * 2, config.hidden_payoff_dim, config.dim_act, config).to(device)
dcgraph = Coordination_Graph(config.n_agents, config.graph_type)
@@ -133,7 +133,7 @@ Source Code
config.done_shape, envs.num_envs, config.buffer_size, config.batch_size)
memory = buffer(*input_buffer, max_episode_length=envs.max_episode_length, dim_act=config.dim_act)
- from xuanpolicy.torch.learners.multi_agent_rl.dcg_learner import DCG_Learner
+ from xuance.torch.learners.multi_agent_rl.dcg_learner import DCG_Learner
learner = DCG_Learner(config, policy, optimizer, scheduler,
config.device, config.model_dir, config.gamma,
config.sync_frequency)
diff --git a/docs/build/html/_sources/documents/api/agents/marl/iddpg.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/iddpg.rst.txt
index 3c41d821c..46b1b19fd 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/iddpg.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/iddpg.rst.txt
@@ -8,17 +8,17 @@ IDDPG_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.iddpg_agents.IDDPG_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.iddpg_agents.IDDPG_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.iddpg_agents.IDDPG_Agents.act(obs_n, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.iddpg_agents.IDDPG_Agents.act(obs_n, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -30,7 +30,7 @@ IDDPG_Agents
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.iddpg_agents.IDDPG_Agents.train(i_episode)
+ xuance.torch.agent.mutli_agent_rl.iddpg_agents.IDDPG_Agents.train(i_episode)
Train the multi-agent reinforcement learning model.
@@ -65,7 +65,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class IDDPG_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/ippo.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/ippo.rst.txt
index e3925988c..b363f077a 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/ippo.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/ippo.rst.txt
@@ -8,12 +8,12 @@ IPPO_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.ippo_agents.IPPO_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.ippo_agents.IPPO_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
diff --git a/docs/build/html/_sources/documents/api/agents/marl/iql.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/iql.rst.txt
index 1421187dd..d09dbf1f8 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/iql.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/iql.rst.txt
@@ -8,17 +8,17 @@ IQL_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.iql_agents.IQL_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.iql_agents.IQL_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.iql_agents.IQL_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.iql_agents.IQL_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -34,7 +34,7 @@ IQL_Agents
:rtype: tuple(numpy.ndarray, numpy.ndarray), np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.iql_agents.IQL_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.iql_agents.IQL_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -69,7 +69,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class IQL_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/isac.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/isac.rst.txt
index cda1ab151..d0ad06cc5 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/isac.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/isac.rst.txt
@@ -8,17 +8,17 @@ ISAC_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.isac_agents.ISAC_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.isac_agents.ISAC_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.isac_agents.ISAC_Agents.act(obs_n, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.isac_agents.ISAC_Agents.act(obs_n, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -30,7 +30,7 @@ ISAC_Agents
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.isac_agents.ISAC_Agents.train(i_episode)
+ xuance.torch.agent.mutli_agent_rl.isac_agents.ISAC_Agents.train(i_episode)
Train the multi-agent reinforcement learning model.
@@ -65,7 +65,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class ISAC_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/maddpg.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/maddpg.rst.txt
index 542ec46a9..a02c6603a 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/maddpg.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/maddpg.rst.txt
@@ -8,17 +8,17 @@ MADDPG_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.maddpg_agents.MADDPG_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.maddpg_agents.MADDPG_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.maddpg_agents.MADDPG_Agents.act(obs_n, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.maddpg_agents.MADDPG_Agents.act(obs_n, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -30,7 +30,7 @@ MADDPG_Agents
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.maddpg_agents.MADDPG_Agents.train(i_episode)
+ xuance.torch.agent.mutli_agent_rl.maddpg_agents.MADDPG_Agents.train(i_episode)
Train the multi-agent reinforcement learning model.
@@ -65,7 +65,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class MADDPG_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/mappo.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/mappo.rst.txt
index 50b7e8d0e..74ba1be34 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/mappo.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/mappo.rst.txt
@@ -8,17 +8,17 @@ MAPPO_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.mappo_agents.MAPPO_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.mappo_agents.MAPPO_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.mappo_agents.MAPPO_Agents.act(obs_n, *rnn_hidden, avail_actions=None, state=None, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.mappo_agents.MAPPO_Agents.act(obs_n, *rnn_hidden, avail_actions=None, state=None, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -36,7 +36,7 @@ MAPPO_Agents
:rtype: tuple(numpy.ndarray, numpy.ndarray), np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.mappo_agents.MAPPO_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.mappo_agents.MAPPO_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -72,7 +72,7 @@ Source Code
.. code-block:: python
import torch
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class MAPPO_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/masac.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/masac.rst.txt
index 71c52fdcd..5bf66d292 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/masac.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/masac.rst.txt
@@ -8,17 +8,17 @@ MASAC_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.masac_agents.MASAC_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.masac_agents.MASAC_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.masac_agents.MASAC_Agents.act(obs_n, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.masac_agents.MASAC_Agents.act(obs_n, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -30,7 +30,7 @@ MASAC_Agents
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.masac_agents.MASAC_Agents.train(i_episode)
+ xuance.torch.agent.mutli_agent_rl.masac_agents.MASAC_Agents.train(i_episode)
Train the multi-agent reinforcement learning model.
@@ -65,7 +65,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class MASAC_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/matd3.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/matd3.rst.txt
index 55cf30594..249ba12c1 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/matd3.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/matd3.rst.txt
@@ -8,17 +8,17 @@ MATD3_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.matd3_agents.MATD3_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.matd3_agents.MATD3_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.matd3_agents.MATD3_Agents.act(obs_n, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.matd3_agents.MATD3_Agents.act(obs_n, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -30,7 +30,7 @@ MATD3_Agents
:rtype: np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.matd3_agents.MATD3_Agents.train(i_episode)
+ xuance.torch.agent.mutli_agent_rl.matd3_agents.MATD3_Agents.train(i_episode)
Train the multi-agent reinforcement learning model.
@@ -65,7 +65,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class MATD3_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/mfq.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/mfq.rst.txt
index 96d2f725e..c549cb410 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/mfq.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/mfq.rst.txt
@@ -8,17 +8,17 @@ MFQ_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.mfq_agents.MFQ_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.mfq_agents.MFQ_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.mfq_agents.MFQ_Agents.act(obs_n, *rnn_hidden, act_mean=None, agent_mask=False, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.mfq_agents.MFQ_Agents.act(obs_n, *rnn_hidden, act_mean=None, agent_mask=False, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -36,7 +36,7 @@ MFQ_Agents
:rtype: tuple(numpy.ndarray, numpy.ndarray), np.ndarray, np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.mfq_agents.MFQ_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.mfq_agents.MFQ_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -71,8 +71,8 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
- from xuanpolicy.torch.agents.agents_marl import linear_decay_or_increase
+ from xuance.torch.agents import *
+ from xuance.torch.agents.agents_marl import linear_decay_or_increase
class MFQ_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/qmix.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/qmix.rst.txt
index 08e6c2756..8e7b2bdde 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/qmix.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/qmix.rst.txt
@@ -8,17 +8,17 @@ QMIX_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.qmix_agents.QMIX_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.qmix_agents.QMIX_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.qmix_agents.QMIX_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.qmix_agents.QMIX_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -34,7 +34,7 @@ QMIX_Agents
:rtype: tuple(numpy.ndarray, numpy.ndarray), np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.qmix_agents.QMIX_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.qmix_agents.QMIX_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -69,7 +69,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class QMIX_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/qtran.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/qtran.rst.txt
index 16360f0fd..a2dff0798 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/qtran.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/qtran.rst.txt
@@ -8,17 +8,17 @@ QTRAN_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.qtran_agents.QTRAN_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.qtran_agents.QTRAN_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.qtran_agents.QTRAN_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.qtran_agents.QTRAN_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -53,8 +53,8 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
- from xuanpolicy.torch.agents.agents_marl import linear_decay_or_increase
+ from xuance.torch.agents import *
+ from xuance.torch.agents.agents_marl import linear_decay_or_increase
class QTRAN_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/vdn.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/vdn.rst.txt
index 0d28dcf1e..6764932a7 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/vdn.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/vdn.rst.txt
@@ -8,17 +8,17 @@ VDN_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.vdn_agents.VDN_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.vdn_agents.VDN_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.vdn_agents.VDN_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.vdn_agents.VDN_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -34,7 +34,7 @@ VDN_Agents
:rtype: tuple(numpy.ndarray, numpy.ndarray), np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.vdn_agents.VDN_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.vdn_agents.VDN_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -69,7 +69,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class VDN_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/agents/marl/wqmix.rst.txt b/docs/build/html/_sources/documents/api/agents/marl/wqmix.rst.txt
index e432b960a..a28eb5f11 100644
--- a/docs/build/html/_sources/documents/api/agents/marl/wqmix.rst.txt
+++ b/docs/build/html/_sources/documents/api/agents/marl/wqmix.rst.txt
@@ -8,17 +8,17 @@ WQMIX_Agents
**PyTorch:**
.. py:class::
- xuanpolicy.torch.agent.mutli_agent_rl.wqmix_agents.WQMIX_Agents(config, envs, device)
+ xuance.torch.agent.mutli_agent_rl.wqmix_agents.WQMIX_Agents(config, envs, device)
:param config: Provides hyper parameters.
:type config: Namespace
:param envs: The vectorized environments.
- :type envs: xuanpolicy.environments.vector_envs.vector_env.VecEnv
+ :type envs: xuance.environments.vector_envs.vector_env.VecEnv
:param device: Choose CPU or GPU to train the model.
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.wqmix_agents.WQMIX_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
+ xuance.torch.agent.mutli_agent_rl.wqmix_agents.WQMIX_Agents.act(obs_n, *rnn_hidden, avail_actions=None, test_mode=False)
Calculate joint actions for N agents according to the joint observations.
@@ -34,7 +34,7 @@ WQMIX_Agents
:rtype: tuple(numpy.ndarray, numpy.ndarray), np.ndarray
.. py:function::
- xuanpolicy.torch.agent.mutli_agent_rl.wqmix_agents.WQMIX_Agents.train(i_step)
+ xuance.torch.agent.mutli_agent_rl.wqmix_agents.WQMIX_Agents.train(i_step)
Train the multi-agent reinforcement learning model.
@@ -69,7 +69,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.agents import *
+ from xuance.torch.agents import *
class WQMIX_Agents(MARLAgents):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/api/configs.rst.txt b/docs/build/html/_sources/documents/api/configs.rst.txt
index 2e36005d9..8fb9e06f6 100644
--- a/docs/build/html/_sources/documents/api/configs.rst.txt
+++ b/docs/build/html/_sources/documents/api/configs.rst.txt
@@ -9,13 +9,13 @@ Configs
Basic parameter configuration
------------------------------
-The basic parameter configuration is stored in the xuanpolicy/config/basic.yaml file, for example:
+The basic parameter configuration is stored in the xuance/config/basic.yaml file, for example:
.. code-block:: yaml
dl_toolbox: "torch" # Values: "torch", "mindspore", "tensorlayer"
- project_name: "XuanPolicy_Benchmark"
+ project_name: "XuanCe_Benchmark"
logger: "tensorboard" # Values: tensorboard, wandb.
wandb_user_name: "papers_liu"
@@ -42,7 +42,7 @@ Configs
Algorithm parameter configuration
-----------------------------------
-Taking the parameter configuration of the DQN algorithm in the Atari environment as an example, in addition to the basic configuration, its algorithm-specific configuration is stored in the xuanpolicy/configs/dqn/atari.yaml
+Taking the parameter configuration of the DQN algorithm in the Atari environment as an example, in addition to the basic configuration, its algorithm-specific configuration is stored in the xuance/configs/dqn/atari.yaml
file, with the following content:
.. raw:: html
@@ -107,8 +107,8 @@ Configs
For environments whose scenarios differ greatly, such as the ``CarRacing-v2`` and ``LunarLander`` scenarios in the ``Box2D`` environment,
the former takes a 96*96*3 RGB image as the state input while the latter takes an 8-dimensional vector. Therefore, the DQN parameter configurations for these two scenarios are stored in the following two files:
- * xuanpolicy/configs/dqn/box2d/CarRacing-v2.yaml
- * xuanpolicy/configs/dqn/box2d/LunarLander-v2.yaml
+ * xuance/configs/dqn/box2d/CarRacing-v2.yaml
+ * xuance/configs/dqn/box2d/LunarLander-v2.yaml
.. raw:: html
@@ -121,7 +121,7 @@ Configs
.. code-block:: python
- import xuanpolicy as xp
+ import xuance as xp
runner = xp.get_runner(method='dqn',
env='classic_control',
env_id='CartPole-v1',
diff --git a/docs/build/html/_sources/documents/api/learners/learner.rst.txt b/docs/build/html/_sources/documents/api/learners/learner.rst.txt
index c21b03bc4..305d4b877 100644
--- a/docs/build/html/_sources/documents/api/learners/learner.rst.txt
+++ b/docs/build/html/_sources/documents/api/learners/learner.rst.txt
@@ -1,12 +1,12 @@
Learner
=======================
-To create new learner, you should build a class inherit from ``xuanpolicy.torch.learners.learner.Learner`` , ``xuanpolicy.tensorflow.learners.learner.Learner``, or ``xuanpolicy.mindspore.learners.learner.Learner``.
+To create a new learner, you should build a class that inherits from ``xuance.torch.learners.learner.Learner``, ``xuance.tensorflow.learners.learner.Learner``, or ``xuance.mindspore.learners.learner.Learner``.
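Below is a minimal sketch of such a subclass for the PyTorch backend. The class name, the batch arguments, and the ``(logits, values)`` return signature of ``self.policy`` are illustrative assumptions; attributes such as ``self.policy``, ``self.optimizer``, and ``self.scheduler`` are assumed to be set by the base-class constructor.

.. code-block:: python

    import torch.nn.functional as F
    from xuance.torch.learners.learner import Learner

    class My_Learner(Learner):
        """A hypothetical learner that performs one gradient step per update() call."""
        def __init__(self, policy, optimizer, scheduler=None, device=None, model_dir="./"):
            super(My_Learner, self).__init__(policy, optimizer, scheduler, device, model_dir)

        def update(self, obs_batch, ret_batch):
            # Placeholder objective: regress the policy's value output onto returns.
            _, values = self.policy(obs_batch)
            loss = F.mse_loss(values, ret_batch)
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()
            if self.scheduler is not None:
                self.scheduler.step()
            return {"loss": loss.item()}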
**PyTorch:**
.. py:class::
- xuanpolicy.torch.learners.learner.Learner(policy, optimizer, scheduler=None, device=None, model_dir="./")
+ xuance.torch.learners.learner.Learner(policy, optimizer, scheduler=None, device=None, model_dir="./")
The basic class of the learner.
@@ -21,14 +21,14 @@ To create new learner, you should build a class inherit from ``xuanpolicy.torch.
:param model_dir: The directory of model file, default is "./".
:type model_dir: str
-.. py:function:: xuanpolicy.torch.learners.learner.Learner.save_model(model_path)
+.. py:function:: xuance.torch.learners.learner.Learner.save_model(model_path)
Save the model.
:param model_path: The model's path.
:type model_path: str
-.. py:function:: xuanpolicy.torch.learners.learner.Learner.load_model(path, seed=1)
+.. py:function:: xuance.torch.learners.learner.Learner.load_model(path, seed=1)
Load a model by specifying the ``path`` and ``seed``.
@@ -37,7 +37,7 @@ To create new learner, you should build a class inherit from ``xuanpolicy.torch.
:param seed: Select the seed that the model was trained with, if it exists.
:type seed: int
-.. py:function:: xuanpolicy.torch.learners.learner.Learner.update(*args)
+.. py:function:: xuance.torch.learners.learner.Learner.update(*args)
Update the policies with self.optimizer.
diff --git a/docs/build/html/_sources/documents/api/representations/cnn.rst.txt b/docs/build/html/_sources/documents/api/representations/cnn.rst.txt
index b75977345..cb51a35de 100644
--- a/docs/build/html/_sources/documents/api/representations/cnn.rst.txt
+++ b/docs/build/html/_sources/documents/api/representations/cnn.rst.txt
@@ -3,7 +3,7 @@ CNN-based
Convolutional Neural Networks (CNNs) are mainly used for processing image input data to extract feature vectors.
They usually take multi-channel image matrices as input and output multi-dimensional vectors.
-The CNN block is defined in `./xuanpolicy/torch/utils/layers.py`, `./xuanpolicy/tensorflow/utils/layers.py` and `./xuanpolicy/mindspore/utils/layers.py`.
+The CNN block is defined in `./xuance/torch/utils/layers.py`, `./xuance/tensorflow/utils/layers.py` and `./xuance/mindspore/utils/layers.py`.
To instantiate this class, you need to specify the input size (`input_shape`), the filtering method (`filter`), the kernel size (`kernel_size`), the stride (`stride`), the normalization method (`normalize`), the activation function (`activation`), and the initialization method (`initialize`).
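As a quick illustration, the higher-level ``Basic_CNN`` representation documented below might be instantiated as follows. The input shape, kernel/stride/filter values, initializer, and activation are example assumptions, not values prescribed by the library.

.. code-block:: python

    import torch.nn as nn
    from xuance.torch.representations.cnn import Basic_CNN

    # Assumed Atari-style input: 4 stacked 84x84 frames; layer sizes are examples only.
    cnn = Basic_CNN(input_shape=(4, 84, 84),
                    kernels=(8, 4, 3),
                    strides=(4, 2, 1),
                    filters=(32, 64, 64),
                    normalize=None,
                    initialize=nn.init.orthogonal_,
                    activation=nn.ReLU,
                    device="cpu")
    # The forward pass returns a dict of features, e.g. outputs["state"].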
@@ -17,7 +17,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
**PyTorch:**
.. py:class::
- xuanpolicy.torch.representations.cnn.Basic_CNN(input_shape, kernels, strides, filters, normalize=None, initialize=None, activation=None, device=None)
+ xuance.torch.representations.cnn.Basic_CNN(input_shape, kernels, strides, filters, normalize=None, initialize=None, activation=None, device=None)
:param input_shape: The shape of the inputs.
:type input_shape: Sequence of int
@@ -36,7 +36,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.representations.cnn.Basic_CNN._create_network()
+ xuance.torch.representations.cnn.Basic_CNN._create_network()
Create the convolutional neural networks.
@@ -44,7 +44,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:rtype: nn.Module
.. py:function::
- xuanpolicy.torch.representations.cnn.Basic_CNN.forward(observations)
+ xuance.torch.representations.cnn.Basic_CNN.forward(observations)
Calculate feature representation of the input observations.
@@ -54,7 +54,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:rtype: dict
.. py:class::
- xuanpolicy.torch.representations.cnn.AC_CNN_Atari(input_shape, kernels, strides, filters, normalize=None, initialize=None, activation=None, device=None)
+ xuance.torch.representations.cnn.AC_CNN_Atari(input_shape, kernels, strides, filters, normalize=None, initialize=None, activation=None, device=None)
:param input_shape: The shape of the inputs.
:type input_shape: Sequence of int
@@ -75,7 +75,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.representations.cnn.AC_CNN_Atari._init_layer(layer, gain=numpy.sqrt(2), bias=0.0)
+ xuance.torch.representations.cnn.AC_CNN_Atari._init_layer(layer, gain=numpy.sqrt(2), bias=0.0)
Initialize the weights and biases of the model.
@@ -89,7 +89,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:rtype: nn.Module
.. py:function::
- xuanpolicy.torch.representations.cnn.AC_CNN_Atari._create_network()
+ xuance.torch.representations.cnn.AC_CNN_Atari._create_network()
Create the convolutional neural networks for actor-critic based algorithms and Atari tasks.
@@ -97,7 +97,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:rtype: nn.Module
.. py:function::
- xuanpolicy.torch.representations.cnn.AC_CNN_Atari.forward(observations)
+ xuance.torch.representations.cnn.AC_CNN_Atari.forward(observations)
Calculate feature representation of the input observations.
@@ -131,7 +131,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.representations import *
+ from xuance.torch.representations import *
# process the input observations with stacks of CNN layers
class Basic_CNN(nn.Module):
diff --git a/docs/build/html/_sources/documents/api/representations/mlp.rst.txt b/docs/build/html/_sources/documents/api/representations/mlp.rst.txt
index 91680b159..ecfa981c3 100644
--- a/docs/build/html/_sources/documents/api/representations/mlp.rst.txt
+++ b/docs/build/html/_sources/documents/api/representations/mlp.rst.txt
@@ -2,7 +2,7 @@ MLP-based
=====================================
The Multi-Layer Perceptron (MLP) is one of the simplest deep neural network models used for processing vector inputs.
-Users can instantiate the MLP module according to their own needs, which is defined in the `./xuanpolicy/torch/utils/layers.py`, `./xuanpolicy/tensorflow/utils/layers.py` and `./xuanpolicy/mindspore/utils/layers.py` files with the class name `mlp_block`.
+Users can instantiate the MLP module according to their own needs; it is defined in the `./xuance/torch/utils/layers.py`, `./xuance/tensorflow/utils/layers.py` and `./xuance/mindspore/utils/layers.py` files with the class name `mlp_block`.
To instantiate this class, you need to specify the input dimension (`input_dim`), output dimension (`output_dim`), normalization method (`normalize`), activation function choice (`activation`), and initialization method (`initialize`).
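As a quick illustration, the higher-level ``Basic_MLP`` representation documented below can be built as in the sketch here, mirroring the call shown on the Professional Usage page. The observation dimension, hidden sizes, initializer, and activation are example assumptions.

.. code-block:: python

    import torch.nn as nn
    from xuance.torch.representations import Basic_MLP

    # Assumed 8-dimensional vector observation and two hidden layers of 256 units.
    mlp = Basic_MLP(input_shape=(8,),
                    hidden_sizes=[256, 256],
                    normalize=None,
                    initialize=nn.init.orthogonal_,
                    activation=nn.ReLU,
                    device="cpu")
    # The forward pass returns a dict of features, e.g. outputs["state"].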
@@ -15,7 +15,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
**PyTorch:**
.. py:class::
- xuanpolicy.torch.representations.mlp.Basic_Identical(input_shape, device)
+ xuance.torch.representations.mlp.Basic_Identical(input_shape, device)
:param input_shape: The shape of the inputs.
:type input_shape: Sequence[int]
@@ -23,7 +23,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.representations.mlp.Basic_Identical.forward(observations)
+ xuance.torch.representations.mlp.Basic_Identical.forward(observations)
Calculate feature representation of the input observations.
@@ -33,7 +33,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:rtype: dict
.. py:class::
- xuanpolicy.torch.representations.mlp.Basic_MLP(input_shape, device)
+ xuance.torch.representations.mlp.Basic_MLP(input_shape, device)
:param input_shape: The shape of the inputs.
:type input_shape: Sequence[int]
@@ -41,7 +41,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:type device: str, int, torch.device
.. py:function::
- xuanpolicy.torch.representations.mlp.Basic_MLP._create_network()
+ xuance.torch.representations.mlp.Basic_MLP._create_network()
Create the multi-layer perceptron networks.
@@ -49,7 +49,7 @@ When implementing this class in PyTorch, you also need to specify the device typ
:rtype: nn.Module
.. py:function::
- xuanpolicy.torch.representations.mlp.Basic_MLP.forward(observations)
+ xuance.torch.representations.mlp.Basic_MLP.forward(observations)
Calculate feature representation of the input observations.
@@ -83,7 +83,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.representations import *
+ from xuance.torch.representations import *
# directly returns the original observation
class Basic_Identical(nn.Module):
diff --git a/docs/build/html/_sources/documents/api/representations/rnn.rst.txt b/docs/build/html/_sources/documents/api/representations/rnn.rst.txt
index 45f27e255..a46bd842b 100644
--- a/docs/build/html/_sources/documents/api/representations/rnn.rst.txt
+++ b/docs/build/html/_sources/documents/api/representations/rnn.rst.txt
@@ -3,7 +3,7 @@ RNN-based
Recurrent Neural Networks (RNNs) are mainly used for processing sequential signal information to extract feature vectors of the current sequence.
Depending on the usage scenario, this software provides two types of RNN modules: `gru_block` and `lstm_block`.
-Their definitions can be found in `./xuanpolicy/torch/utils/layers.py` , `./xuanpolicy/tensorflow/utils/layers.py` and `./xuanpolicy/mindspore/utils/layers.py` respectively.
+Their definitions can be found in `./xuance/torch/utils/layers.py`, `./xuance/tensorflow/utils/layers.py` and `./xuance/mindspore/utils/layers.py` respectively.
To instantiate these classes, you need to specify the input dimension (`input_dim`), output dimension (`output_dim`), the dropout (`dropout`), and initialization method (`initialize`).
@@ -16,7 +16,7 @@ Similarly, when implementing these classes in PyTorch, you also need to specify
**PyTorch:**
.. py:class::
- xuanpolicy.torch.representations.rnn.Basic_RNN(input_shape, hidden_sizes, normalize=None, initialize=None, activation=None, device=None, kwargs)
+ xuance.torch.representations.rnn.Basic_RNN(input_shape, hidden_sizes, normalize=None, initialize=None, activation=None, device=None, kwargs)
The ``hidden_sizes`` is a dict input, which contains "fc_hidden_sizes" and the hidden sizes of the recurrent layers.
The "fc_hidden_sizes" entry gives the sizes of the fully connected layers before the RNN layers.
@@ -43,7 +43,7 @@ Similarly, when implementing these classes in PyTorch, you also need to specify
:type rnn: str
.. py:function::
- xuanpolicy.torch.representations.rnn.Basic_RNN._create_network()
+ xuance.torch.representations.rnn.Basic_RNN._create_network()
Create the recurrent neural networks.
@@ -51,7 +51,7 @@ Similarly, when implementing these classes in PyTorch, you also need to specify
:rtype: nn.Module, nn.Module, int
.. py:function::
- xuanpolicy.torch.representations.rnn.Basic_RNN.forward(x, h, c=None)
+ xuance.torch.representations.rnn.Basic_RNN.forward(x, h, c=None)
Calculate feature representation of the inputs.
@@ -65,7 +65,7 @@ Similarly, when implementing these classes in PyTorch, you also need to specify
:rtype: dict
.. py:function::
- xuanpolicy.torch.representations.rnn.Basic_RNN.init_hidden(batch)
+ xuance.torch.representations.rnn.Basic_RNN.init_hidden(batch)
Initialize a batch of RNN hidden states.
@@ -75,7 +75,7 @@ Similarly, when implementing these classes in PyTorch, you also need to specify
:rtype: torch.Tensor
.. py:function::
- xuanpolicy.torch.representations.rnn.Basic_RNN.init_hidden_item(i, rnn_hidden)
+ xuance.torch.representations.rnn.Basic_RNN.init_hidden_item(i, rnn_hidden)
Initialize a slice of hidden states from the given RNN hidden states.
@@ -87,7 +87,7 @@ Similarly, when implementing these classes in PyTorch, you also need to specify
:rtype: torch.Tensor
.. py:function::
- xuanpolicy.torch.representations.rnn.Basic_RNN.get_hidden_item(i, rnn_hidden)
+ xuance.torch.representations.rnn.Basic_RNN.get_hidden_item(i, rnn_hidden)
Get a slice of hidden states from the given RNN hidden states.
@@ -123,7 +123,7 @@ Source Code
.. code-block:: python
- from xuanpolicy.torch.representations import *
+ from xuance.torch.representations import *
class Basic_RNN(nn.Module):
def __init__(self,
diff --git a/docs/build/html/_sources/documents/benchmark/mujoco.rst.txt b/docs/build/html/_sources/documents/benchmark/mujoco.rst.txt
index 0dc390440..2549e3c55 100644
--- a/docs/build/html/_sources/documents/benchmark/mujoco.rst.txt
+++ b/docs/build/html/_sources/documents/benchmark/mujoco.rst.txt
@@ -22,7 +22,7 @@ Results
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
| Task | | Ant | HalfCheetah | Hopper | Walker2d | Swimmer | Humanoid | Reacher | Ipendulum | IDPendulum |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
-| DDPG | XuanPolicy | 1472.8 | 10093 | 3434.9 | 2443.7 | 67.7 | 99 | -4.05 | 1000 | 9359.8 |
+| DDPG | XuanCe | 1472.8 | 10093 | 3434.9 | 2443.7 | 67.7 | 99 | -4.05 | 1000 | 9359.8 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
| | Tianshou | 990.4 | 11718.7 | 2197 | 1400.6 | 144.1 | 177.3 | -3.3 | 1000 | 8364.3 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
@@ -30,7 +30,7 @@ Results
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
| | SpinningUp | 840 | 11000 | 1800 | 1950 | 137 | / | / | / | / |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
-| TD3 | XuanPolicy | 4822.9 | 10718.1 | 3492.4 | 4307.9 | 59.9 | 547.88 | -4.07 | 1000 | 9358.9 |
+| TD3 | XuanCe | 4822.9 | 10718.1 | 3492.4 | 4307.9 | 59.9 | 547.88 | -4.07 | 1000 | 9358.9 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
| | Tianshou | 5116.4 | 10201.2 | 3472.2 | 3982.4 | 104.2 | 5189.5 | -2.7 | 1000 | 9349.2 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
@@ -38,13 +38,13 @@ Results
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
| | SpinningUp | 3800 | 9750 | 2860 | 4000 | 78 | / | / | / | / |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
-| A2C | XuanPolicy | 1420.4 | 2674.5 | 825.9 | 970.6 | 51.4 | 240.9 | -11.7 | 1000 | 9357.8 |
+| A2C | XuanCe | 1420.4 | 2674.5 | 825.9 | 970.6 | 51.4 | 240.9 | -11.7 | 1000 | 9357.8 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
| | Tianshou | 3485.4 | 1829.9 | 1253.2 | 1091.6 | 36.6 | 1726 | -6.7 | 1000 | 9257.7 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
| | Published | / | 1000 | 900 | 850 | 31 | / | -24 | 1000 | 8100 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
-| PPO | XuanPolicy | 2810.7 | 4628.4 | 3450.1 | 4318.6 | 108.9 | 705.5 | -8.1 | 1000 | 9359.1 |
+| PPO | XuanCe | 2810.7 | 4628.4 | 3450.1 | 4318.6 | 108.9 | 705.5 | -8.1 | 1000 | 9359.1 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
| | Tianshou | 3258.4 | 5783.9 | 2609.3 | 3588.5 | 66.7 | 787.1 | -4.1 | 1000 | 9231.3 |
+------+------------+--------+-------------+--------+----------+---------+----------+---------+-----------+------------+
diff --git a/docs/build/html/_sources/documents/usage/basic_usage.rst.txt b/docs/build/html/_sources/documents/usage/basic_usage.rst.txt
index 9428055da..63d33c1ec 100644
--- a/docs/build/html/_sources/documents/usage/basic_usage.rst.txt
+++ b/docs/build/html/_sources/documents/usage/basic_usage.rst.txt
@@ -8,13 +8,13 @@ Quick Start
Run a DRL example
-----------------------
-In XuanPolicy, it is easy to build a DRL agent. First you need to create a *runner*
+In XuanCe, it is easy to build a DRL agent. First, you need to create a *runner*
and specify the ``agent_name`` and ``env_name``; then a runner containing the agent, policy, environments, etc., will be built.
Finally, execute ``runner.run`` and the agent's model will start training.
.. code-block:: python
- import xuanpolicy as xp
+ import xuance as xp
runner = xp.get_runner(method='dqn',
env='classic_control',
env_id='CartPole-v1',
@@ -30,12 +30,12 @@ After training the agent, you can test and view the model by the following codes
Run an MARL example
-----------------------
-XuanPolicy support MARL algorithms with both cooperative and competitive tasks.
+XuanCe supports MARL algorithms for both cooperative and competitive tasks.
Similarly, you can start by:
.. code-block:: python
- import xuanpolicy as xp
+ import xuance as xp
runner = xp.get_runner(method='maddpg',
env='mpe',
env_id='simple_spread_v3',
@@ -46,7 +46,7 @@ For competitve tasks in which agents can be divided to two or more sides, you ca
.. code-block:: python
- import xuanpolicy as xp
+ import xuance as xp
runner = xp.get_runner(method=["maddpg", "iddpg"],
env='mpe',
env_id='simple_push_v3',
@@ -59,12 +59,12 @@ The "adversary"s are MADDPG agents, and the "agent"s are IDDPG agents.
Test
-----------------------
-After completing the algorithm training, XuanPolicy will save the model files and training log information in the designated directory.
+After completing the algorithm training, XuanCe will save the model files and training log information in the designated directory.
Users can specify "is_test=True" to perform testing.
.. code-block:: python
- import xuanpolicy as xp
+ import xuance as xp
runner = xp.get_runner(method='dqn',
env_name='classic_control',
env_id='CartPole-v1',
@@ -76,7 +76,7 @@ In the above code, "runner.benchmark()" can also be used instead of "runner.run(
Logger
-----------------------
-You can use the tensorboard or wandb to visualize the training process by specifying the "logger" parameter in the "xuanpolicy/configs/basic.yaml".
+You can visualize the training process with tensorboard or wandb by specifying the "logger" parameter in "xuance/configs/basic.yaml".
.. code-block:: yaml
@@ -101,7 +101,7 @@ Taking the path "./logs/dqn/torch/CartPole-v0" as an example, users can visualiz
**2. W&B**
If you choose to use the wandb tool for training visualization,
-you can create an account according to the official W&B instructions and specify the username "wandb_user_name" in the "xuanpolicy/configs/basic.yaml" file.
+you can create an account according to the official W&B instructions and specify the username "wandb_user_name" in the "xuance/configs/basic.yaml" file.
For information on using W&B and its local deployment, you can refer to the following link:
diff --git a/docs/build/html/_sources/documents/usage/installation.rst.txt b/docs/build/html/_sources/documents/usage/installation.rst.txt
index f52911bbd..9d2b6d691 100644
--- a/docs/build/html/_sources/documents/usage/installation.rst.txt
+++ b/docs/build/html/_sources/documents/usage/installation.rst.txt
@@ -3,10 +3,10 @@ Installation
The library runs on Linux, Windows, MacOS, EulerOS, etc., and is easy to install.
-Before installing **XuanPolicy**, you should install Anaconda_ to prepare a python environment.
+Before installing **XuanCe**, you should install Anaconda_ to prepare a Python environment.
-After that, open a terminal and install **XuanPolicy** by the following steps.
-You can choose two ways to install XuanPolicy.
+After that, open a terminal and install **XuanCe** by following the steps below.
+You can choose between two ways to install XuanCe.
.. raw:: html
@@ -31,38 +31,38 @@ Install via PyPI
.. code-block:: bash
- pip install xuanpolicy
+ pip install xuance
This command does not include the dependencies of deep learning toolboxes.
-You can also install the **XuanPolicy** with PyTorch_, TensorFlow2_, MindSpore_, or all of them.
+You can also install **XuanCe** with PyTorch_, TensorFlow2_, MindSpore_, or all of them.
.. code-block:: bash
- pip install xuanpolicy[torch]
+ pip install xuance[torch]
or
.. code-block:: bash
- pip install xuanpolicy[tensorflow]
+ pip install xuance[tensorflow]
or
.. code-block:: bash
- pip install xuanpolicy[mindspore]
+ pip install xuance[mindspore]
or
.. code-block:: bash
- pip install xuanpolicy[all]
+ pip install xuance[all]
Install from GitHub repository
---------------------------------------------
-Alternatively, you can install XuanPolicy from its GitHub repository.
+Alternatively, you can install XuanCe from its GitHub repository.
.. note::
@@ -80,19 +80,19 @@ Alternatively, you can install XuanPolicy from its GitHub repository.
conda activate xpolicy
-**Step 3**: Download the source code of XuanPolicy from GitHub.
+**Step 3**: Download the source code of XuanCe from GitHub.
.. code-block:: bash
- git clone https://github.com/agi-brain/xuanpolicy.git
+ git clone https://github.com/agi-brain/xuance.git
-**Step 4**: Change directory to the xuanpolicy.
+**Step 4**: Change the directory to xuance.
.. code-block:: bash
- cd xuanpolicy
+ cd xuance
-**Step 5**: Install xuanpolicy.
+**Step 5**: Install xuance.
.. code-block:: bash
@@ -114,13 +114,13 @@ Alternatively, you can install XuanPolicy from its GitHub repository.
Testing whether the installation was successful
--------------------------------------------------------------------
-After installing XuanPolicy, you can enter the Python runtime environment by typing "python" in the terminal.
-Then, test the installation of xuanpolicy by typing:
+After installing XuanCe, you can enter the Python runtime environment by typing "python" in the terminal.
+Then, test the installation of xuance by typing:
.. code-block:: python
- import xuanpolicy
+ import xuance
-If no error or warning messages are displayed, it indicates that XuanPolicy has been successfully installed.
+If no error or warning messages are displayed, it indicates that XuanCe has been successfully installed.
You can proceed to the next step and start using it.
diff --git a/docs/build/html/_sources/documents/usage/professional_usage.rst.txt b/docs/build/html/_sources/documents/usage/professional_usage.rst.txt
index 89595ee07..63bc42854 100644
--- a/docs/build/html/_sources/documents/usage/professional_usage.rst.txt
+++ b/docs/build/html/_sources/documents/usage/professional_usage.rst.txt
@@ -2,7 +2,7 @@ Professional Usage
================================
The previous page demonstrated how to directly run an algorithm by calling the runner.
-In order to help users better understand the internal implementation process of "XuanPolicy",
+In order to help users better understand the internal implementation process of "XuanCe",
and to facilitate further algorithm development and the implementation of users' own reinforcement learning tasks,
this section takes training the PPO algorithm on a MuJoCo environment task as an example,
and provides a detailed introduction on how to call the lower-level APIs to train a reinforcement learning model.
@@ -20,7 +20,7 @@ Here we show a config file named "mujoco.yaml" for MuJoCo environment in gym.
.. code-block:: yaml
dl_toolbox: "torch" # The deep learning toolbox. Choices: "torch", "mindspore", "tensorlayer"
- project_name: "XuanPolicy_Benchmark"
+ project_name: "XuanCe_Benchmark"
logger: "tensorboard" # Choices: tensorboard, wandb.
wandb_user_name: "your_user_name"
render: False
@@ -89,7 +89,7 @@ which uses the Python package `argparser` to read the command line instructions
import argparse
def parse_args():
- parser = argparse.ArgumentParser("Example of XuanPolicy.")
+ parser = argparse.ArgumentParser("Example of XuanCe.")
parser.add_argument("--method", type=str, default="ppo")
parser.add_argument("--env", type=str, default="mujoco")
parser.add_argument("--env-id", type=str, default="InvertedPendulum-v4")
@@ -107,7 +107,7 @@ and then the configuration parameters from Step 1 are obtained.
.. code-block:: python
- from xuanpolicy import get_arguments
+ from xuance import get_arguments
if __name__ == "__main__":
parser = parse_args()
@@ -118,8 +118,8 @@ and then the configuration parameters from Step 1 are obtained.
parser_args=parser)
run(args)
-In this step, the ``get_arguments()`` function from "XuanPolicy" is called.
-In this function, it first searches for readable parameters based on the combination of the ``env`` and ``env_id`` variables in the `xuanpolicy/configs/` directory.
+In this step, the ``get_arguments()`` function from "XuanCe" is called.
+In this function, it first searches for readable parameters based on the combination of the ``env`` and ``env_id`` variables in the `xuance/configs/` directory.
If default parameters already exist, they are all read. Then, the function locates the configuration file from Step 1 via the ``config.path`` path and reads all the parameters from the .yaml file.
Finally, it reads all the parameters from the ``parser``.
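
The reading order above suggests a layered configuration: built-in defaults from `xuance/configs/`, then the .yaml file located via ``config.path``, then the command-line arguments. The helper below is not part of the library; it is only a minimal sketch of such layering, under the assumption that later sources override earlier ones:

.. code-block:: python

    from types import SimpleNamespace

    def merge_configs(defaults: dict, yaml_cfg: dict, cli_args: dict) -> SimpleNamespace:
        # Later dictionaries override earlier ones, mirroring the reading order
        # described above (built-in defaults -> .yaml file -> command line).
        return SimpleNamespace(**{**defaults, **yaml_cfg, **cli_args})

    args = merge_configs({"env_id": "Ant-v4", "seed": 1},       # built-in default (illustrative)
                         {"env_id": "HalfCheetah-v4"},          # from the .yaml file (illustrative)
                         {"env_id": "InvertedPendulum-v4"})     # from the parser
    print(args.env_id)  # InvertedPendulum-v4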
@@ -145,10 +145,10 @@ Here is an example definition of the run() function with comments:
import numpy as np
import torch.optim
- from xuanpolicy.common import space2shape
- from xuanpolicy.environment import make_envs
- from xuanpolicy.torch.utils.operations import set_seed
- from xuanpolicy.torch.utils import ActivationFunctions
+ from xuance.common import space2shape
+ from xuance.environment import make_envs
+ from xuance.torch.utils.operations import set_seed
+ from xuance.torch.utils import ActivationFunctions
def run(args):
agent_name = args.agent # get the name of Agent.
@@ -165,7 +165,7 @@ Here is an example definition of the run() function with comments:
n_envs = envs.num_envs # get the number of vectorized environments.
# prepare representation
- from xuanpolicy.torch.representations import Basic_MLP
+ from xuance.torch.representations import Basic_MLP
representation = Basic_MLP(input_shape=space2shape(args.observation_space),
hidden_sizes=args.representation_hidden_size,
normalize=None,
@@ -174,7 +174,7 @@ Here is an example definition of the run() function with comments:
device=args.device) # create representation
# prepare policy
- from xuanpolicy.torch.policies import Gaussian_AC_Policy
+ from xuance.torch.policies import Gaussian_AC_Policy
policy = Gaussian_AC_Policy(action_space=args.action_space,
representation=representation,
actor_hidden_size=args.actor_hidden_size,
@@ -185,7 +185,7 @@ Here is an example definition of the run() function with comments:
device=args.device) # create Gaussian policy
# prepare agent
- from xuanpolicy.torch.agents import PPOCLIP_Agent, get_total_iters
+ from xuance.torch.agents import PPOCLIP_Agent, get_total_iters
optimizer = torch.optim.Adam(policy.parameters(), args.learning_rate, eps=1e-5) # create optimizer
lr_scheduler = torch.optim.lr_scheduler.LinearLR(optimizer, start_factor=1.0, end_factor=0.0,
total_iters=get_total_iters(agent_name, args)) # for learning rate decay
@@ -256,6 +256,6 @@ After finishing the above three steps, you can run the `python_mujoco.py` file i
The source code of this example is available at the following link:
-`https://github.com/agi-brain/xuanpolicy/examples/ppo/ppo_mujoco.py
+`https://github.com/agi-brain/xuance/examples/ppo/ppo_mujoco.py
-xuanpolicy.torch.agent.policy_gradient.a2c_agent.A2C_Agent
-xuanpolicy.torch.agent.policy_gradient.a2c_agent.A2C_Agent._action()
-xuanpolicy.torch.agent.policy_gradient.a2c_agent.A2C_Agent.train()
-xuanpolicy.torch.agent.policy_gradient.a2c_agent.A2C_Agent.test()
+xuance.torch.agent.policy_gradient.a2c_agent.A2C_Agent
+xuance.torch.agent.policy_gradient.a2c_agent.A2C_Agent._action()
+xuance.torch.agent.policy_gradient.a2c_agent.A2C_Agent.train()
+xuance.torch.agent.policy_gradient.a2c_agent.A2C_Agent.test()
PyTorch:
config (Namespace) – Provides hyper parameters.
-envs (xuanpolicy.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
+envs (xuance.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
policy (nn.Module) – The policy that provides actions and values.
optimizer (torch.optim.Optimizer) – The optimizer that updates the parameters.
scheduler (torch.optim.lr_scheduler._LRScheduler) – Implement the learning rate decay.
Calculate actions according to the observations.
Train the A2C agent.
Test the trained model.
import numpy as np
-from xuanpolicy.torch.agents import *
+from xuance.torch.agents import *
class A2C_Agent(Agent):
@@ -414,7 +414,7 @@ Source Code
- © Copyright 2023, XuanPolicy contributors.
+ © Copyright 2023, XuanCe contributors.
-xuanpolicy.torch.agents.agent.Agent
-xuanpolicy.torch.agents.agent.Agent.save_model()
-xuanpolicy.torch.agents.agent.Agent.load_model()
-xuanpolicy.torch.agents.agent.Agent.log_infos()
-xuanpolicy.torch.agents.agent.Agent.log_videos()
-xuanpolicy.torch.agents.agent.Agent._process_observation()
-xuanpolicy.torch.agents.agent.Agent._process_reward()
-xuanpolicy.torch.agents.agent.Agent._action()
-xuanpolicy.torch.agents.agent.Agent.train()
-xuanpolicy.torch.agents.agent.Agent.test()
-xuanpolicy.torch.agents.agent.Agent.finish()
+xuance.torch.agents.agent.Agent
+xuance.torch.agents.agent.Agent.save_model()
+xuance.torch.agents.agent.Agent.load_model()
+xuance.torch.agents.agent.Agent.log_infos()
+xuance.torch.agents.agent.Agent.log_videos()
+xuance.torch.agents.agent.Agent._process_observation()
+xuance.torch.agents.agent.Agent._process_reward()
+xuance.torch.agents.agent.Agent._action()
+xuance.torch.agents.agent.Agent.train()
+xuance.torch.agents.agent.Agent.test()
+xuance.torch.agents.agent.Agent.finish()
-xuanpolicy.tensorflow.agent.agent.Agent
-xuanpolicy.mindspore.agent.agent.Agent
+xuance.tensorflow.agent.agent.Agent
+xuance.mindspore.agent.agent.Agent
-To create a new Agent, you should build a class inherit from xuanpolicy.torch.agents.agent.Agent,
-xuanpolicy.tensorflow.agents.agent.Agent,
-or xuanpolicy.mindspore.agents.agent.Agent.
+To create a new Agent, you should build a class that inherits from xuance.torch.agents.agent.Agent,
+xuance.tensorflow.agents.agent.Agent,
+or xuance.mindspore.agents.agent.Agent.
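
A minimal sketch of such a subclass is given below for the PyTorch base class. It is illustrative only: the constructor parameters and method names mirror the interface documented on this page, but the import path, the assumption that the base class exposes the environments as ``self.envs``, and the random-action body are not taken from the library.

.. code-block:: python

    import numpy as np
    from xuance.torch.agents import Agent  # import path assumed from the wildcard imports above

    class Random_Agent(Agent):
        """Toy agent that ignores observations and acts at random (illustrative only)."""

        def __init__(self, config, envs, policy, memory, learner, device,
                     log_dir="./logs/", model_dir="./models/"):
            # The argument list mirrors the parameters documented below; whether the
            # base class accepts them in exactly this order is an assumption.
            super(Random_Agent, self).__init__(config, envs, policy, memory, learner,
                                               device, log_dir, model_dir)

        def _action(self, obs):
            # Get actions according to the observations: here, one random sample per
            # vectorized environment (self.envs.action_space is assumed to exist).
            return np.stack([self.envs.action_space.sample()
                             for _ in range(self.envs.num_envs)])

        def train(self, train_steps):
            # A real agent would roll out in self.envs, store transitions in
            # self.memory, and update self.policy through self.learner.
            raise NotImplementedError

        def test(self, env_fn, test_episodes):
            raise NotImplementedError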
PyTorch:
config (Namespace) – Provides hyper parameters.
-envs (xuanpolicy.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
+envs (xuance.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
policy (nn.Module) – The policy that provides actions and values.
-memory (xuanpolicy.common.memory_tools.Buffer) – Experice replay buffer.
-learner (xuanpolicy.torch.learner.Learner) – The learner that updates parameters of policy.
+memory (xuance.common.memory_tools.Buffer) – Experience replay buffer.
+learner (xuance.torch.learner.Learner) – The learner that updates the parameters of the policy.
device (str, int, torch.device) – Choose CPU or GPU to train the model.
log_dir (str) – The directory of log file, default is “./logs/”.
model_dir (str) – The directory of model file, default is “./models/”.
Save the model.
Load a model by specifying the path and seed.
Visualize the training information via wandb or tensorboard.
Visualize the interaction between agent and environment by uploading the videos with wandb or tensorboard.
Normalize the original observations.
Normalize the original rewards.
Get actions for executing according to the observations.
Train the agents for the given number of steps.
Test the agents.
Finish (close) the wandb or tensorboard logger.
TensorFlow:
config (Namespace) – Provides hyper parameters.
-envs (xuanpolicy.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
+envs (xuance.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
policy (nn.Module) – The policy that provides actions and values.
-memory (xuanpolicy.common.memory_tools.Buffer) – Experice replay buffer.
-learner (xuanpolicy.tensorflow.learner.Learner) – The learner that updates parameters of policy.
+memory (xuance.common.memory_tools.Buffer) – Experience replay buffer.
+learner (xuance.tensorflow.learner.Learner) – The learner that updates the parameters of the policy.
device (str) – Choose CPU or GPU to train the model.
log_dir (str) – The directory of log file, default is “./logs/”.
model_dir (str) – The directory of model file, default is “./models/”.
MindSpore:
-envs (xuanpolicy.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
+envs (xuance.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
policy (nn.Module) – The policy that provides actions and values.
-memory (xuanpolicy.common.memory_tools.Buffer) – Experice replay buffer.
-learner (xuanpolicy.mindspore.learner.Learner) – The learner that updates parameters of policy.
+memory (xuance.common.memory_tools.Buffer) – Experience replay buffer.
+learner (xuance.mindspore.learner.Learner) – The learner that updates the parameters of the policy.
device (str) – Choose CPU or GPU to train the model.
log_dir (str) – The directory of log file, default is “./logs/”.
model_dir (str) – The directory of model file, default is “./models/”.
-© Copyright 2023, XuanPolicy contributors.
+© Copyright 2023, XuanCe contributors.
-xuanpolicy.torch.agent.qlearning_family.c51_agent.C51_Agent
-xuanpolicy.torch.agent.qlearning_family.c51_agent.C51_Agent._action()
-xuanpolicy.torch.agent.qlearning_family.c51_agent.C51_Agent.train()
-xuanpolicy.torch.agent.qlearning_family.c51_agent.C51_Agent.test()
+xuance.torch.agent.qlearning_family.c51_agent.C51_Agent
+xuance.torch.agent.qlearning_family.c51_agent.C51_Agent._action()
+xuance.torch.agent.qlearning_family.c51_agent.C51_Agent.train()
+xuance.torch.agent.qlearning_family.c51_agent.C51_Agent.test()
PyTorch:
config (Namespace) – Provides hyper parameters.
-envs (xuanpolicy.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
+envs (xuance.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
policy (nn.Module) – The policy that provides actions and values.
optimizer (torch.optim.Optimizer) – The optimizer that updates the parameters.
scheduler (torch.optim.lr_scheduler._LRScheduler) – Implement the learning rate decay.
Calculate actions according to the observations.
Train the C51DQN agent.
Test the trained model.
-from xuanpolicy.torch.agents import *
+from xuance.torch.agents import *
class C51_Agent(Agent):
def __init__(self,
@@ -396,7 +396,7 @@ Source Code
- © Copyright 2023, XuanPolicy contributors.
+ © Copyright 2023, XuanCe contributors.
diff --git a/docs/build/html/documents/api/agents/drl/ddpg.html b/docs/build/html/documents/api/agents/drl/ddpg.html
index 7b5dc20c7..ac58a5678 100644
--- a/docs/build/html/documents/api/agents/drl/ddpg.html
+++ b/docs/build/html/documents/api/agents/drl/ddpg.html
@@ -4,7 +4,7 @@
- DDPG_Agent — XuanPolicy v0.1.11 documentation
+ DDPG_Agent — XuanCe v0.1.11 documentation
@@ -40,7 +40,7 @@
- XuanPolicy
+ XuanCe
@@ -80,10 +80,10 @@
- SAC_Agent
- SACDIS_Agent
- DDPG_Agent
-xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent
-xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent._action()
-xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.train()
-xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.test()
+xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent
+xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent._action()
+xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.train()
+xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.test()
- Source Code
@@ -138,7 +138,7 @@
@@ -161,13 +161,13 @@
DDPG_Agent¶
PyTorch:
--
-class xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent(config, envs, policy, optimizer, scheduler, device)¶
+-
+class xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent(config, envs, policy, optimizer, scheduler, device)¶
- Parameters:
config (Namespace) – Provides hyper parameters.
-envs (xuanpolicy.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
+envs (xuance.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
policy (nn.Module) – The policy that provides actions and values.
optimizer (list[torch.optim.Optimizer]) – The optimizers of actor and critic that update the parameters.
scheduler (torch.optim.lr_scheduler._LRScheduler) – Implement the learning rate decay.
@@ -178,8 +178,8 @@ DDPG_Agent
--
-xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent._action(obs, noise_scale)¶
+-
+xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent._action(obs, noise_scale)¶
Calculate actions according to the observations.
- Parameters:
@@ -198,8 +198,8 @@ DDPG_Agent
--
-xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.train(train_steps)¶
+-
+xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.train(train_steps)¶
Train the DDPG agent.
- Parameters:
@@ -209,8 +209,8 @@ DDPG_Agent
--
-xuanpolicy.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.test(env_fn, test_episodes)¶
+-
+xuance.torch.agent.policy_gradient.ddpg_agent.DDPG_Agent.test(env_fn, test_episodes)¶
Test the trained model.
- Parameters:
@@ -233,7 +233,7 @@ DDPG_Agent
Source Code¶
-from xuanpolicy.torch.agents import *
+from xuance.torch.agents import *
class DDPG_Agent(Agent):
def __init__(self,
@@ -389,7 +389,7 @@ Source Code
- © Copyright 2023, XuanPolicy contributors.
+ © Copyright 2023, XuanCe contributors.
diff --git a/docs/build/html/documents/api/agents/drl/ddqn.html b/docs/build/html/documents/api/agents/drl/ddqn.html
index c2fc239e3..daddd1c9b 100644
--- a/docs/build/html/documents/api/agents/drl/ddqn.html
+++ b/docs/build/html/documents/api/agents/drl/ddqn.html
@@ -4,7 +4,7 @@
- DDQN_Agent — XuanPolicy v0.1.11 documentation
+ DDQN_Agent — XuanCe v0.1.11 documentation
@@ -40,7 +40,7 @@
- XuanPolicy
+ XuanCe
@@ -65,10 +65,10 @@
- DQN_Agent
- C51_Agent
- DDQN_Agent
-xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent
-xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent._action()
-xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.train()
-xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.test()
+xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent
+xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent._action()
+xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.train()
+xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.test()
- Source Code
@@ -138,7 +138,7 @@
@@ -162,13 +162,13 @@ DDQN_Agent
--
-class xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent(config, envs, policy, optimizer, scheduler, device)¶
+-
+class xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent(config, envs, policy, optimizer, scheduler, device)¶
- Parameters:
config (Namespace) – Provides hyper parameters.
-envs (xuanpolicy.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
+envs (xuance.environments.vector_envs.vector_env.VecEnv) – The vectorized environments.
policy (nn.Module) – The policy that provides actions and values.
optimizer (torch.optim.Optimizer) – The optimizer that updates the parameters.
scheduler (torch.optim.lr_scheduler._LRScheduler) – Implement the learning rate decay.
@@ -179,8 +179,8 @@ DDQN_Agent
--
-xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent._action(obs, egreedy)¶
+-
+xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent._action(obs, egreedy)¶
Calculate actions according to the observations.
- Parameters:
@@ -199,8 +199,8 @@ DDQN_Agent
--
-xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.train(train_steps)¶
+-
+xuance.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.train(train_steps)¶
Train the Double DQN agent.
- Parameters:
@@ -210,8 +210,8 @@ DDQN_Agent
--
-xuanpolicy.torch.agent.qlearning_family.ddqn_agent.DDQN_Agent.test(