-
Notifications
You must be signed in to change notification settings - Fork 0
bonetblai/gpt-rewards
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Distribution of RTDP-BEL for reward-based POMDPs based on the transformation to SSP described in Solving POMDPs: RTDP-Bel bs. Point-based Algorithms. Blai Bonet and Hector Geffner. Proc. of 21st Int. Joint Conf. on Artificial Intelligence (IJCAI). Pasadena, California. 2009. AAAI Press. Pages 1641-1646. Available at http://ldc.usb.ve/~bonet/reports/IJCAI09-rtdpbel.pdf The RTDP-BEL is implemented inside an extract of the GPT software. GPT is a shell-base system for solving POMDPs. Here, we only provide an example session on its usage. For more information, contact Blai Bonet <[email protected]>. REQUIREMENTS ------------ You will need a proper GNU readline library. The ones that come with Mac OSX are not good. In my mac, I use "brew install readline" to obtain a good library. INSTALLATION ------------ Unpack and perform 'make all'. By default, gpt is installed in the bin directory. (In fact, bin/gpt is a symbolic link to src/rtdp-bel/gpt). EXAMPLE ------- Suppose that you are at the top-level directory and the problem RockSample_7_8.POMDP is in the same directory. $ bin/gpt Welcome to GPT, Version 2.00-cassandra. gpt> set pims [8,2000,500] gpt> solve RockSample_7_8.POMDP - [some output] gpt> quit The set command is used to set the value of different parameters. Among the most important are: 'qlevels' that specify the discretization levels used for quantization (a higher qlevels means a finer discretization, default 15), and 'cutoff' that specify the max length of a trial (default 250). The pims parameter tells gpt how it must solve the POMDP and evaluate the calculated policy. In the example, the pims tell gpt that it should perform 16000 trials of RTDP-BEL, and that each 2000 trials, the current policy should be evaluated with 500 trials. Hence, such specification permits the generation of a quality profile that shows how the policy is improved as more RTDP-BEL trials are performed. The solve command receives two parameters: the path to a POMDP file (in Cassandra's format) and the name of a file to redirect the output. If the latter is '-', then the output is redirected to the standard output. For any questions or comments, please email me <[email protected]>. Blai Bonet
About
Automatically exported from code.google.com/p/gpt-rewards
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published