Stickbug
The stickbug environment is a 3D kinematic simulation of a stickbug robot. It lets you move both the robot base and the arms, with options for position and velocity control. It also supports custom observation and pollination models. For more information, see the robot body, orchard, observation, and pollination sections.
To run the Stickbug environment, you must supply a set of parameters for each of the above. For a sample, see the file sb_params.json on GitHub.
Note that observations currently report whether a flower was pollinated. In the future this may be made partially observable.
Contents
StickbugEnv
This module contains the StickbugEnv for simulating stickbug.
- class irl_gym.envs.stickbug.StickbugEnv(*, seed: Optional[int] = None, params: Optional[dict] = None)
Bases: Env
Environment for modelling stickbug.
Due to the nature of action and observation space, env_checker has been disabled for this environment.
For more information see gym.Env docs
States (dict)
“base”: {“pose” : [x, y, yaw], “velocity” : [v_x, v_y, v_yaw]}
“arms”: {“<side><rel_pos>” : {“position”:[z, th1, th2, cam_yaw, cam_pitch], “velocity”: …, “bounds”: …}, …}
Observations
Agent position is fully observable: {“base”: {}, “arms”: {}}
Flower positions in observation: {“position” : [x, y, z], “orientation” : [x, y, z]}
Pollinated flowers: {“<side><rel_pos>” : {“position” : [x, y, z], “orientation” : [x, y, z]}}
(Should this always be returned, or only after a successful attempt (currently this one)? There are arguments for both; maybe add a flag.)
In the future maybe observe rows/plants too?
Actions
{“base”: {“mode”: “position”/“velocity”, “command”: [x, y, yaw]},
“arms”: {“<side><rel_pos>”: {“mode”: “position”/“velocity”, “is_joint”: True/False, “command”: [x, y, z, th1, th2, cam_yaw, cam_pitch], “pollinate”: True/False, “is_relative”: True/False}},
“dt”: time step}
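As an illustration of the action format, the following sketch builds one action dict in plain Python. The helper names make_arm_command and make_action and the arm key "left_top" are hypothetical; the real “<side><rel_pos>” keys depend on the robot parameters supplied to the environment. Treating “dt” as a top-level key is also an assumption.

```python
def make_arm_command(command, mode="position", is_joint=True,
                     pollinate=False, is_relative=False):
    """Build the per-arm part of an action (hypothetical helper).

    command follows the documented order:
    [x, y, z, th1, th2, cam_yaw, cam_pitch].
    """
    if mode not in ("position", "velocity"):
        raise ValueError(f"unknown mode: {mode!r}")
    return {
        "mode": mode,
        "is_joint": is_joint,
        "command": list(command),
        "pollinate": pollinate,
        "is_relative": is_relative,
    }


def make_action(base_command, arm_commands, base_mode="velocity", dt=0.1):
    """Assemble a full action dict (hypothetical helper).

    base_command is [x, y, yaw]; arm_commands maps arm keys to per-arm dicts.
    Placing "dt" at the top level is an assumption about the format.
    """
    if base_mode not in ("position", "velocity"):
        raise ValueError(f"unknown mode: {base_mode!r}")
    return {
        "base": {"mode": base_mode, "command": list(base_command)},
        "arms": dict(arm_commands),
        "dt": dt,
    }


# "left_top" is a made-up arm key used only for illustration.
action = make_action(
    base_command=[0.1, 0.0, 0.0],
    arm_commands={
        "left_top": make_arm_command([0.0, 0.0, 0.5, 0.0, 0.0, 0.0, 0.0]),
    },
)
```

An action built this way would then be passed to step(a=action).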
Transition Probabilities
Motion:
- Base: 100 % probability of moving in commanded direction
- Arms: 100 % probability of moving in commanded direction unless at a boundary, in which case 0 % probability of moving in commanded direction
Flower pollination: specified by the user with a pollination class
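The deterministic arm motion with a boundary check can be sketched for a single joint as below. This is a minimal illustration, not the environment's implementation: the function name step_joint is hypothetical, and it assumes a commanded move that would leave the bounds is rejected outright (the joint stays where it is) rather than clamped to the limit.

```python
def step_joint(position, velocity, dt, lower, upper):
    """Advance one joint by velocity * dt (hypothetical helper).

    Assumption: a move that would leave [lower, upper] is not applied
    at all, matching "0 % probability of moving in commanded direction".
    """
    candidate = position + velocity * dt
    if lower <= candidate <= upper:
        return candidate
    return position


# Inside the bounds the joint moves; at the boundary it stays put.
moved = step_joint(0.0, 1.0, 0.1, lower=0.0, upper=1.0)
blocked = step_joint(0.95, 1.0, 0.1, lower=0.0, upper=1.0)
```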
Reward
\(R_{min}\), cost of a single timestep (added to all time steps regardless of pollination)
\(R_{max}\), reward for pollinating all flowers
\(R_{max}/N\), reward for pollinating a single flower, where \(N\) is the number of flowers
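How the three terms combine is not spelled out here. One possible reading, shown as a sketch only, treats them as additive: every step costs \(R_{min}\), each flower pollinated on that step adds \(R_{max}/N\), and completing all flowers adds \(R_{max}\). The function name sketch_reward and its arguments are hypothetical; the default r_range matches the documented default (-0.01, 1).

```python
def sketch_reward(n_flowers, n_newly_pollinated, all_pollinated,
                  r_range=(-0.01, 1.0)):
    """One possible additive reading of the reward terms (assumption).

    n_flowers: N, total number of flowers.
    n_newly_pollinated: flowers pollinated during this transition.
    all_pollinated: True once every flower has been pollinated.
    """
    r_min, r_max = r_range
    reward = r_min                                    # per-step cost
    reward += (r_max / n_flowers) * n_newly_pollinated  # per-flower reward
    if all_pollinated:
        reward += r_max                               # completion reward
    return reward


idle_step = sketch_reward(n_flowers=4, n_newly_pollinated=0, all_pollinated=False)
one_flower = sketch_reward(n_flowers=4, n_newly_pollinated=1, all_pollinated=False)
```

The actual reward(s, a, sp) method may combine these terms differently, so it should be checked before relying on this reading.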
Input
- Parameters:
seed – (int) RNG seed, default: None
Remaining parameters are passed as arguments through the params dict. The corresponding keys are as follows:
- Parameters:
base – (dict) Parameters for the base of the stickbug
support – (dict) Parameters for the support and arms of the stickbug
orchard – (dict) Parameters for the orchard of flowers
rows – (dict) Parameters for the rows of flowers
plant – (dict) Parameters for the plants of flowers
observation – (dict) Parameters for the observation of the flowers
flowers – (dict) Parameters for the flowers in the environment
r_range – (tuple) min and max params of reward, default: (-0.01, 1)
t_max – (int) max time steps, default: 100
dt – (float) time step, default: 0.1
prefix – (str) where to save images, default: “<cwd>/plot”
render – (str) render mode, default: “plot”
render_bounds – (dict) bounds for rendering, default: {“x”: [-5, 5], “y”: [-5, 5], “z”: [0, 5]}
save_frames – (bool) save images for gif, default: False
save_gif – (bool) save gif, default: False
log_level – (str) Level of logging to use. For more info see logging levels, default: “WARNING”
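A minimal sketch of assembling the params dict from the keys above. The scalar values are the documented defaults. The contents of the component dicts (base, support, orchard, rows, plant, observation, flowers) are not documented in this section, so they are left as empty placeholders, which is an assumption; real values should come from a file such as sb_params.json. The import is guarded so the snippet runs even without irl_gym installed.

```python
import os

# Scalar settings, using the defaults documented above.
params = {
    "r_range": (-0.01, 1),
    "t_max": 100,
    "dt": 0.1,
    "prefix": os.path.join(os.getcwd(), "plot"),
    "render": "plot",
    "render_bounds": {"x": [-5, 5], "y": [-5, 5], "z": [0, 5]},
    "save_frames": False,
    "save_gif": False,
    "log_level": "WARNING",
}

# Component dicts: placeholders only (assumption). Their required contents
# are described in the robot body, orchard, observation, and pollination
# sections, with a sample in sb_params.json.
for key in ("base", "support", "orchard", "rows", "plant",
            "observation", "flowers"):
    params[key] = {}

try:
    from irl_gym.envs.stickbug import StickbugEnv
except ImportError:
    StickbugEnv = None  # irl_gym is not installed in this interpreter

# With real component parameters filled in, construction would look like:
# env = StickbugEnv(seed=0, params=params)
```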
- get_actions(s: dict)
Gets range of actions for a given state
- Parameters:
s – (State) state from which to get actions
- Returns:
(dict) Range of actions, neighbors
- get_fignum()
Gets figure number
- Returns:
(int) figure number
- img_2_gif()
Converts images to gif
- metadata: dict[str, Any] = {'render_fps': 5, 'render_modes': ['plot', 'print', 'none']}
- render()
Renders environment
Has 1 render mode:
plot uses matplotlib visualization
Visualization
- reset(*, seed: Optional[int] = None, options: dict = {})
Resets environment to initial state and sets RNG seed.
Deviates from Gym in that the RNG seed is assumed to be resettable at will.
- Parameters:
seed – (int) RNG seed, default: None
options – (dict) params for reset, see initialization, default: {}
- Returns:
(tuple) State Observation, Info
- reward(s: dict, a: Optional[int] = None, sp: Optional[dict] = None)
Gets rewards for \((s,a,s')\) transition
- Parameters:
s – (State) Initial state (unused in this environment)
a – (int) Action (unused in this environment), default: None
sp – (State) resultant state, default: None
- Returns:
(float) reward
- step(a: Optional[dict] = None)
Increments environment by one timestep
- Parameters:
a – (dict) action, default: None
- Returns:
(tuple) State, reward, is_done, is_truncated, info