Legal Actions during Rollout #35

Open
LorenzoBonanni opened this issue Feb 2, 2023 · 4 comments

Comments

@LorenzoBonanni

During a random rollout, the action is sampled from all of the environment's actions instead of only the actions that are legal in the current state.
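To make the distinction concrete, here is a minimal sketch comparing the full action space with the state-dependent one. It assumes RockSample.jl provides a state-dependent `actions(pomdp, s)` method; the small three-rock problem is only an illustrative stand-in, not the configuration from this issue.

```julia
using POMDPs
using POMDPTools
using RockSample
using Random

# Illustrative problem only, not the configuration from this issue.
pomdp = RockSamplePOMDP(rocks_positions=[(2, 3), (4, 4), (4, 2)])

rng = MersenneTwister(1)
s = rand(rng, initialstate(pomdp))   # sample a concrete start state

all_actions = actions(pomdp)         # every action defined by the problem
legal_actions = actions(pomdp, s)    # actions available in state s

println("all actions: ", collect(all_actions))
println("legal in s:  ", collect(legal_actions))
```

A rollout that draws from `actions(pomdp)` can pick actions that are missing from `actions(pomdp, s)`.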

@zsunberg
Member

zsunberg commented Feb 4, 2023

If you use FORollout, it should use the state-based action space.

If you use PORollout, it uses the NothingUpdater by default, which doesn't pass any belief information that could limit the action space. If you want to keep using PORollout and get only the actions that are legal for the belief, you can give it a different updater.
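As a rough sketch of the FORollout option (assuming `FORollout` accepts a solver in the same way `PORollout` does, and using a small illustrative problem with arbitrary solver parameters instead of the ones from this issue):

```julia
using POMDPs
using POMDPTools
using BasicPOMCP
using RockSample
using Random

# Illustrative problem and parameters, chosen only to keep the example small.
pomdp = RockSamplePOMDP(rocks_positions=[(2, 3), (4, 4), (4, 2)])

solver = POMCPSolver(
    # FORollout runs the rollout policy on states, so the random policy
    # should sample from the state-dependent action space.
    estimate_value=FORollout(RandomSolver(MersenneTwister(1))),
    max_depth=20,
    tree_queries=1000,
    rng=MersenneTwister(2),
)

planner = solve(solver, pomdp)
a = action(planner, initialstate(pomdp))
println("chosen action: ", a)
```

No belief updater is passed here because the rollout is simulated from a sampled state and does not track a belief.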

@LorenzoBonanni
Author

LorenzoBonanni commented Feb 4, 2023

I tried using PORollout with DiscreteUpdater, but it throws an error because there is no extract_belief method for that updater; the function is only defined for NothingUpdater and PreviousObservationUpdater.

Here is the code I used:

```julia
using RockSample
using POMDPs
using POMDPTools
using BasicPOMCP
using Random
using ParticleFilters

rocks = [[10, 1], [11, 6], [9, 6], [2, 5]]
const n_particle = 32768 # 2^15
rand_noise_generator_for_sim = MersenneTwister(2980164632)
rand_noise_generator_seed_for_planner = MersenneTwister(941564507)
env = RockSamplePOMDP{4}(
    map_size=(12, 12),
    rocks_positions=rocks,
    sensor_efficiency=20.0,
    discount_factor=0.95,
    good_rock_reward=10.0,
    bad_rock_penalty=-10.0,
    sensor_use_penalty=0.0,
    step_penalty=0.0
)
pf = UnweightedParticleFilter(env, n_particle, rand_noise_generator_for_sim)

solver = POMCPSolver(
    estimate_value=PORollout(
        RandomSolver(rand_noise_generator_for_sim),
        DiscreteUpdater(env)
    ),
    max_depth=100,
    c=1.0,
    tree_queries=n_particle,
    rng=rand_noise_generator_seed_for_planner
)
policy = solve(solver, env)
ib = initialstate(env)
a, ai = action_info(policy, ib)
```

@zsunberg
Member

zsunberg commented Feb 5, 2023

Got it. Is there a reason you can't use FORollout? It may be possible to use PORollout, but you will have to write some more code.

@LorenzoBonanni
Author

No, there isn't. I was just playing around.
