Stickbug

The stickbug environment is a 3D kinematic simulation of a stickbug robot. It lets you move both the robot base and the arms, with options for position and velocity control. It also supports custom observation and pollination models. For more information, see the robot body, orchard, observation, and pollination sections.

To run the Stickbug environment, you will need to supply a set of parameters for each of the above. For a sample, see the file sb_params.json on GitHub.
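A minimal sketch of loading such a parameter file and handing it to the environment. The helper name load_params and the local file path are hypothetical, and the sketch assumes the JSON file holds the params dict at its top level. The constructor call is left commented out so the snippet runs without irl_gym installed; it uses only the keyword arguments (seed, params) shown in the class signature.

```python
import json
from pathlib import Path


def load_params(path):
    """Read a JSON parameter file into a plain dict (hypothetical helper)."""
    with Path(path).open("r", encoding="utf-8") as f:
        params = json.load(f)
    if not isinstance(params, dict):
        raise ValueError("expected a JSON object at the top level")
    return params


# Assumed usage, with sb_params.json saved next to the script:
# from irl_gym.envs.stickbug import StickbugEnv
# params = load_params("sb_params.json")
# env = StickbugEnv(seed=0, params=params)
```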

Also note that observations report whether a flower was pollinated. In the future this may be made partially observable.

Contents

StickbugEnv

This module contains the StickbugEnv for simulating stickbug.

class irl_gym.envs.stickbug.StickbugEnv(*, seed: Optional[int] = None, params: Optional[dict] = None)

Bases: Env

Environment for modelling stickbug.

Due to the nature of the action and observation spaces, env_checker has been disabled for this environment.

For more information see gym.Env docs

States (dict)

  • “base”: {“pose” : [x, y, yaw], “velocity” : [v_x, v_y, v_yaw]}

  • “arms”: {“<side><rel_pos>” : {“position”:[z, th1, th2, cam_yaw, cam_pitch], “velocity”: …, “bounds”: …}, …}

Observations

  • Agent position is fully observable: {“base”: {}, “arms”: {}}

  • Flower positions in observation: {“position” : [x, y, z], “orientation” : [x, y, z]}

  • Pollinated flowers: {“<side><rel_pos>” : {“position” : [x, y, z], “orientation” : [x, y, z]}}

(Should this always be returned, or only after a successful attempt (currently the latter)? There are arguments for both; maybe add a flag.)

In the future maybe observe rows/plants too?

Actions

{
  “base”: {“mode”: “position”/“velocity”, “command”: [x, y, yaw]},
  “arms”: {
    “<side><rel_pos>”: {
      “mode”: “position”/“velocity”,
      “is_joint”: True/False,
      “command”: [x, y, z, th1, th2, cam_yaw, cam_pitch],
      “pollinate”: True/False,
      “is_relative”: True/False
    }
  },
  “dt”: time step
}
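As an illustration of the action layout, the following sketch assembles an action dict from the documented keys. The helper names make_arm_action and make_action are hypothetical, "arm_0" is a placeholder for a real "<side><rel_pos>" key, and placing "dt" at the top level of the action is an assumption read from the bracket structure above.

```python
def make_arm_action(command, mode="position", is_joint=True,
                    pollinate=False, is_relative=False):
    """Build one arm entry using the documented keys (hypothetical helper)."""
    if mode not in ("position", "velocity"):
        raise ValueError(f"unknown mode: {mode!r}")
    return {
        "mode": mode,
        "is_joint": is_joint,
        "command": list(command),  # [x, y, z, th1, th2, cam_yaw, cam_pitch]
        "pollinate": pollinate,
        "is_relative": is_relative,
    }


def make_action(base_command, arm_actions, base_mode="position", dt=0.1):
    """Combine base and arm commands into one action dict (hypothetical helper)."""
    if base_mode not in ("position", "velocity"):
        raise ValueError(f"unknown mode: {base_mode!r}")
    return {
        "base": {"mode": base_mode, "command": list(base_command)},  # [x, y, yaw]
        "arms": dict(arm_actions),
        "dt": dt,  # assumed to sit at the top level of the action
    }


# "arm_0" is a placeholder; real keys follow the "<side><rel_pos>" pattern.
action = make_action(
    base_command=[0.5, 0.0, 0.0],
    arm_actions={
        "arm_0": make_arm_action([0.0, 0.0, 1.0, 0.1, 0.2, 0.0, 0.0], pollinate=True),
    },
)
```

A dict of this shape would then be passed to step(a).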

Transition Probabilities

Motion:

  • Base: 100% probability of moving in the commanded direction

  • Arms: 100% probability of moving in the commanded direction, unless at a boundary, in which case 0% probability of moving in the commanded direction

  • Flower pollination: specified by the user with a pollination class
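The boundary rule for the arms can be sketched for a single joint as below. This is a literal reading of the description (a move that would cross a bound does not happen at all), not the environment's actual code; the function name step_joint and its arguments are hypothetical, and the real implementation may instead clamp to the bound.

```python
def step_joint(position, delta, lower, upper):
    """Deterministic move: apply delta unless the result would leave [lower, upper]."""
    target = position + delta
    if lower <= target <= upper:
        return target    # moves as commanded (probability 1)
    return position      # blocked by the boundary (probability 0 of moving)


inside = step_joint(0.5, 0.25, lower=0.0, upper=1.0)   # within bounds, so it moves
blocked = step_joint(0.9, 0.5, lower=0.0, upper=1.0)   # would exceed upper, so it stays
```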

Reward

  • \(R_{min}\), cost of a single timestep (added to all time steps regardless of pollination)

  • \(R_{max}\), reward for pollinating all flowers

  • \(R_{max}/N\), reward for pollinating a single flower, where N is the number of flowers
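One way to read these terms as a per-step reward is sketched below. It assumes that \(R_{max}\) is reached as the sum of the per-flower rewards rather than paid as a separate bonus, which the description does not settle; the function name step_reward and its arguments are hypothetical.

```python
def step_reward(n_new_pollinated, n_flowers, r_range=(-0.01, 1.0)):
    """Per-step reward: time cost plus R_max/N for each flower pollinated this step."""
    r_min, r_max = r_range
    if n_flowers <= 0:
        raise ValueError("n_flowers must be positive")
    return r_min + (r_max / n_flowers) * n_new_pollinated


# With 4 flowers, a step that pollinates one flower earns R_min + R_max / 4.
r_one = step_reward(n_new_pollinated=1, n_flowers=4)
# A step that pollinates nothing only pays the time cost R_min.
r_none = step_reward(n_new_pollinated=0, n_flowers=4)
```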

Input

Parameters:

seed – (int) RNG seed, default: None

Remaining parameters are passed as arguments through the params dict. The corresponding keys are as follows:

Parameters:
  • base – (dict) Parameters for the base of the stickbug

  • support – (dict) Parameters for the support and arms of the stickbug

  • orchard – (dict) Parameters for the orchard of flowers

  • rows – (dict) Parameters for the rows of flowers

  • plant – (dict) Parameters for the plants of flowers

  • observation – (dict) Parameters for the observation of the flowers

  • flowers – (dict) Parameters for the flowers in the environment

  • r_range – (tuple) min and max params of reward, default: (-0.01, 1)

  • t_max – (int) max time steps, default: 100

  • dt – (float) time step, default: 0.1

  • prefix – (str) where to save images, default: “<cwd>/plot”

  • render – (str) render mode, default: “plot”

  • render_bounds – (dict) bounds for rendering, default: {“x”: [-5, 5], “y”: [-5, 5], “z”: [0, 5]}

  • save_frames – (bool) save images for gif, default: False

  • save_gif – (bool) save gif, default: False

  • log_level – (str) Level of logging to use. For more info see logging levels, default: “WARNING”
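The keys above can be collected into a params dict along the following lines. The defaults are copied from the list; the helper build_params is hypothetical, treating all seven section dicts as required is an assumption, and the contents of each section dict are described in the other sections rather than here.

```python
import copy

# Defaults as documented above; "prefix" is left out because it depends on the cwd.
DOCUMENTED_DEFAULTS = {
    "r_range": (-0.01, 1),
    "t_max": 100,
    "dt": 0.1,
    "render": "plot",
    "render_bounds": {"x": [-5, 5], "y": [-5, 5], "z": [0, 5]},
    "save_frames": False,
    "save_gif": False,
    "log_level": "WARNING",
}

# Sub-dicts whose contents are defined elsewhere in the documentation.
SECTION_KEYS = ("base", "support", "orchard", "rows", "plant", "observation", "flowers")


def build_params(sections, **overrides):
    """Merge section dicts and overrides over the documented defaults."""
    missing = [k for k in SECTION_KEYS if k not in sections]
    if missing:
        raise KeyError(f"missing parameter sections: {missing}")
    params = copy.deepcopy(DOCUMENTED_DEFAULTS)
    params.update(copy.deepcopy(dict(sections)))
    params.update(overrides)
    return params


# Empty section dicts are placeholders only; a real run needs their actual contents.
example = build_params({k: {} for k in SECTION_KEYS}, t_max=200)
```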

get_actions(s: dict)

Gets range of actions for a given state

Parameters:

s – (State) state from which to get actions

Returns:

(dict) Range of actions, neighbors

get_fignum()

Gets figure number

Returns:

(int) figure number

img_2_gif()

Converts images to gif

metadata: dict[str, Any] = {'render_fps': 5, 'render_modes': ['plot', 'print', 'none']}

render()

Renders environment

Of the render modes listed in metadata, one is documented here:

  • plot uses matplotlib visualization

Visualization

reset(*, seed: Optional[int] = None, options: dict = {})

Resets environment to initial state and sets RNG seed.

Deviates from Gym in that the RNG seed is assumed to be resettable at any time.

Parameters:
  • seed – (int) RNG seed, default: None

  • options – (dict) params for reset, see initialization, default: {}

Returns:

(tuple) State Observation, Info

reward(s: dict, a: Optional[int] = None, sp: Optional[dict] = None)

Gets rewards for \((s,a,s')\) transition

Parameters:
  • s – (State) Initial state (unused in this environment)

  • a – (int) Action (unused in this environment), default: None

  • sp – (State) resultant state, default: None

Returns:

(float) reward

step(a: Optional[dict] = None)

Increments environment by one timestep

Parameters:

a – (dict) action, default: None

Returns:

(tuple) State, reward, is_done, is_truncated, info