Developer guide

BorjaFG edited this page Mar 4, 2019 · 18 revisions

Main apps

These are the main apps in this project:

  • Badger: C# GUI app that allows the user to design, run and analyze experiments.
  • Herd Agent: C# service/daemon process that is installed in the slave machines to receive job requests, run them and return the results.
  • RLSimion: C++ executable that does the actual simulation and Reinforcement Learning process. As input, it receives an XML file with the parameters of the experiment. As output, it generates log files and, if asked to, visualizes the experiment in real time.

The diagram below shows a simplified view of the experiment execution process:

Distributed execution diagram

Compiling the source code

[[Here|Compiling-the-source-code]] you can find a guide to setting up the development environment.

FAQ

RLSimion

How do I create a new simulation environment (World)?

You must implement a subclass of the base class DynamicModel (worlds/world.h) that implements the following methods:

  • constructor(ConfigNode* pParameters): used to construct and initialize the world.
    • State variables are defined by calling addStateVariable(NAME,UNITS,MIN_VALUE,MAX_VALUE,CIRCULAR);, where NAME is the name of the state variable, UNITS is the units in which the variable's value is expressed, the range of values is given by [MIN_VALUE,MAX_VALUE] (Note: use literal values so that the parser can parse them), and CIRCULAR is an optional boolean indicating whether the variable is circular, such as an angle (defaults to false).
    • Action variables are similarly defined calling addActionVariable(NAME,UNITS,MIN_VALUE,MAX_VALUE);
    • Parameter templates in parameters.h can be used so that they can be configured from Badger at design time.
    • To inform the code parser that this class implements a world, the following macro should be included: METADATA("World", WORLD-NAME);, where WORLD-NAME is the name given to the world (shown in the GUI).
  • void reset(State *s);: called at the beginning of an episode to initialize the world. State variables in s can be randomly initialized or reset to some initial state of the system.
  • void executeAction(State *s, const Action *a, double dt);: called each time-step to calculate the next state of the simulated environment. s is the current state and a the action currently being executed; this method must update s with the state resulting dt seconds after the current one.

You also need to add the new subclass in the factory method DynamicWorld::getInstance (more about this here).

We recommend taking one of the existing worlds (e.g. worlds/underwater-vehicle.cpp) as a reference.
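The steps above can be sketched in a self-contained example. Note that PointMassWorld and the minimal State/Action/DynamicModel stand-ins below are hypothetical stand-ins defined only so the sketch compiles on its own; the real classes (which take a ConfigNode* in the constructor and use addStateVariable/addActionVariable and METADATA) live in worlds/world.h.

```cpp
#include <map>
#include <string>

// Minimal stand-ins for RLSimion's variable sets and world interface,
// defined here only so this sketch is self-contained.
struct NamedVarSet {
    std::map<std::string, double> values;
    double get(const std::string& name) const { return values.at(name); }
    void set(const std::string& name, double v) { values[name] = v; }
};
using State = NamedVarSet;
using Action = NamedVarSet;

class DynamicModel {
public:
    virtual ~DynamicModel() = default;
    virtual void reset(State* s) = 0;
    virtual void executeAction(State* s, const Action* a, double dt) = 0;
};

// A hypothetical 1-D point-mass world: state = {position, velocity},
// action = {force}. In the real code, the constructor would declare
// these via addStateVariable("position", "m", -10.0, 10.0) and
// addActionVariable("force", "N", -1.0, 1.0), and would include
// METADATA("World", "Point-Mass"); for the code parser.
class PointMassWorld : public DynamicModel {
    static constexpr double mass = 1.0;
public:
    void reset(State* s) override {
        // Reset to a fixed initial state (could also be randomized).
        s->set("position", 0.0);
        s->set("velocity", 0.0);
    }
    void executeAction(State* s, const Action* a, double dt) override {
        double accel = a->get("force") / mass;       // F = m*a
        double v = s->get("velocity") + accel * dt;  // Euler integration
        s->set("position", s->get("position") + v * dt);
        s->set("velocity", v);
    }
};
```

A real world would also be registered in DynamicWorld::getInstance so that Badger can instantiate it by name.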

How do I create a new agent/controller (Simion)?

You must implement a subclass of Simion (simion.h) that implements the following abstract methods:

  • double selectAction(const State* s, Action* a);. This is called every time-step so that the agent selects an action to execute. s is the current state, and the selected action must be written to a (the agent may set only a subset of the action variables). The method should return the probability with which the action was selected (currently ignored).
  • double update(const State* s, const Action* a, const State* s_p, double r, double probability);. This method is called to let the agent learn from an experience tuple <s,a,s_p,r> (probability should be ignored). It may be called more than once per time-step if Experience Replay is being used.

You also need to add the new subclass in the factory method Simion::getInstance (more about this here).
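As a minimal illustration of the two methods, here is a self-contained sketch of a fixed (non-learning) controller. BangBangController and the State/Action/Simion stand-ins are hypothetical, defined only so the example compiles on its own; the real interface is declared in simion.h.

```cpp
#include <map>
#include <string>

// Minimal stand-ins so this sketch is self-contained; the real State,
// Action and Simion classes are part of RLSimion.
struct NamedVarSet {
    std::map<std::string, double> values;
    double get(const std::string& name) const { return values.at(name); }
    void set(const std::string& name, double v) { values[name] = v; }
};
using State = NamedVarSet;
using Action = NamedVarSet;

class Simion {
public:
    virtual ~Simion() = default;
    virtual double selectAction(const State* s, Action* a) = 0;
    virtual double update(const State* s, const Action* a,
                          const State* s_p, double r, double probability) = 0;
};

// A hypothetical bang-bang controller: it pushes with full force against
// the sign of "position" and has nothing to learn.
class BangBangController : public Simion {
    double m_maxForce = 1.0;
public:
    double selectAction(const State* s, Action* a) override {
        a->set("force", s->get("position") > 0.0 ? -m_maxForce : m_maxForce);
        return 1.0; // deterministic policy; the return value is currently ignored
    }
    double update(const State*, const Action*, const State*,
                  double, double) override {
        return 0.0; // a fixed controller ignores the experience tuple
    }
};
```

A learning agent would instead use update to adjust its policy or value estimates from the <s,a,s_p,r> tuple, and would be registered in Simion::getInstance.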

Linux/Windows compilation

Since there is a separate Visual Studio project for each target platform, new source files must be added to both projects so that both platforms offer the same features.
