Robotics Development ROS 2 2026 — Nav2 / MoveIt / Gazebo / Isaac Sim / MuJoCo / LeRobot / GR00T Deep Dive

Prologue — The robotics "ChatGPT moment" turned out to be real

When ChatGPT shipped in November 2022, many robotics researchers asked the same question: "When does our ChatGPT moment arrive?" For robots to do in the physical world what LLMs did with text, four axes had to align — data, simulation, hardware, and models.

As of May 2026, that alignment is mostly complete.

  • Data: The Hugging Face LeRobot dataset hub opened in October 2024, and within 18 months thousands of robot datasets have been uploaded. The Open X-Embodiment consortium has gathered 22 robot platforms, 527 skills, and over 1 million trajectories.
  • Simulation: NVIDIA Isaac Sim 5.0 delivers GPU-accelerated physics and rendering on Omniverse. MuJoCo became fully open source after Google DeepMind's acquisition. Gazebo Harmonic took over from Classic.
  • Hardware: Humanoids like Tesla Optimus, Figure 02, 1X NEO Beta, Unitree G1, Apptronik Apollo, and Sanctuary Phoenix are entering mass production in the 10K~20K USD price band.
  • Models: Google RT-2 (2023), OpenVLA (2024), Physical Intelligence π0 (2024), NVIDIA GR00T N1 (2025) — VLA (Vision-Language-Action) has emerged as a new model category.

On top of all this sits ROS 2. ROS 1 effectively died when Noetic hit EOL in May 2025, and ROS 2 has settled into a yearly release cadence with an LTS every even year: Jazzy Jalisco (2024.05, LTS), Kilted Kaiju (2025.05), and Lyrical Luth (2026.05, LTS, imminent).

This article maps the entire 2026 robotics stack — ROS 2, Nav2, MoveIt 2, Gazebo, Isaac Sim, MuJoCo, LeRobot, GR00T, VLAs, Foxglove — in one breath.


1. The 2026 robotics landscape — humanoid boom meets LLM+robot

First the big picture. Four major currents are running through robotics in 2026, all at once.

1.1 Humanoid boom

If 2024~2025 was the "year of demos," 2026 is the "year of mass production."

  • Tesla Optimus — Beta launched late 2025, external sales targeted for Q3 2026. Runs on Tesla's own FSD chip.
  • Figure 02 — Partnered with OpenAI until early 2025; 100 units deployed at BMW's Spartanburg plant.
  • 1X NEO Beta — Announced 2024, home shipments starting 2026. Price around 20K USD.
  • Boston Dynamics Atlas (Electric) — Hydraulic Atlas retired April 2024; electric Atlas debuted. Now owned by Hyundai, trained on Hyundai Motor Group's data and assembly lines.
  • Unitree G1 — Chinese Unitree, 16K USD, open API.
  • Apptronik Apollo — Deployed at Mercedes-Benz factories.
  • Sanctuary Phoenix — 7th-generation industrial humanoid from Canada.

1.2 Simulation-first learning (Sim-to-Real)

Humanoids' biggest problem is data scarcity. Humans see and hear petabytes of data over a lifetime; humanoids have a few terabytes at most. Enter simulation-first learning.

NVIDIA's slogan is "3 Computers": a training computer (DGX), a simulation computer (Omniverse/Isaac Sim), and an execution computer (Jetson Thor). Run thousands of environments in parallel in Isaac Sim, train policies via reinforcement learning, then transfer those policies to real robots (sim-to-real).

1.3 VLA — the robot GPT moment

The first signal came in July 2023 with Google DeepMind's RT-2 (Robotic Transformer 2). They took a VLM (PaLI-X) and fine-tuned it to emit action tokens. The result generalized natural-language commands like "pick up the extinct animal" to objects it had never seen.

A Stanford-led team open-sourced OpenVLA (7B) in June 2024. In October 2024, Physical Intelligence unveiled π0 (pi-zero), a flow-matching VLA with about 3.3B parameters, showing dexterity demos like folding shirts and washing dishes. NVIDIA's GR00T N1 (2025) is a 2B-parameter model dedicated to humanoids.

1.4 Data democratization — Hugging Face LeRobot

In October 2024, Hugging Face launched LeRobot, moving robotics ML out of a walled garden (much like NLP before ChatGPT) and toward GitHub-level accessibility. It bundles pre-trained models, datasets, tutorials, and low-cost learning arm kits like SO-100/SO-101 (around 200 USD).

2026 robotics stack

[ Model     ]  RT-2 / OpenVLA / π0 / GR00T N1 / Helix
                       |
[ Learning  ]  LeRobot / Open X-Embodiment / RoboHive
                       |
[ Simulator ]  Isaac Sim 5.0 / MuJoCo / Gazebo Harmonic / Webots
                       |
[ Middleware]  ROS 2 Jazzy/Kilted/Lyrical + Nav2 + MoveIt 2
                       |
[ Hardware  ]  Optimus / Figure / 1X / Atlas / G1 / Apollo / Phoenix
                       |
[ Visualize ]  Foxglove Studio / RViz2 / PlotJuggler

We'll walk through each layer.


2. ROS 2 Jazzy → Kilted → Lyrical — one release per year, LTS every other

2.1 From ROS 1 to ROS 2 — why move

ROS 1 started at Willow Garage in 2007 and became the de facto standard in academia and research. But it had limits.

  • Single master (roscore) dependency — every node disconnects if the master dies.
  • Lack of real-time — robotics control needs millisecond determinism; ROS 1 didn't guarantee it.
  • No security — anyone could publish/subscribe to any topic.
  • Weak Windows / embedded support.

ROS 2 was redesigned from scratch to address these.

  • DDS (Data Distribution Service) as middleware — industrial communications standard, no master.
  • QoS (Quality of Service) policies — fine-grained control over Reliability, Durability, History.
  • DDS-Security — authentication, encryption, access control.
  • rclcpp/rclpy — C++17 and Python 3 based.
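
The QoS bullet above hides one practical gotcha: a publisher and a subscriber only connect when their policies are compatible. A minimal pure-Python sketch of that "request vs. offered" rule follows. The `QoS` class and `compatible` function are hypothetical illustrations, not the rclpy API, and only Reliability and Durability are modeled.

```python
from dataclasses import dataclass

# Policy "strength" ordering: a higher number is a stronger guarantee.
RELIABILITY = {"best_effort": 0, "reliable": 1}
DURABILITY = {"volatile": 0, "transient_local": 1}


@dataclass(frozen=True)
class QoS:
    """Hypothetical stand-in for a QoS profile (not rclpy.qos.QoSProfile)."""
    reliability: str = "reliable"
    durability: str = "volatile"
    depth: int = 10  # History depth for KEEP_LAST


def compatible(offered: QoS, requested: QoS) -> bool:
    """The publisher must offer at least what the subscriber requests."""
    return (
        RELIABILITY[offered.reliability] >= RELIABILITY[requested.reliability]
        and DURABILITY[offered.durability] >= DURABILITY[requested.durability]
    )


# A typical sensor publisher trades reliability for latency.
sensor_pub = QoS(reliability="best_effort")
strict_sub = QoS(reliability="reliable")
relaxed_sub = QoS(reliability="best_effort")

print(compatible(sensor_pub, strict_sub))   # subscriber asks for more than is offered
print(compatible(sensor_pub, relaxed_sub))  # subscriber accepts best effort
```

This is why a default (reliable) subscriber can silently receive nothing from a best-effort sensor topic: the mismatch prevents the connection rather than degrading it.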

When ROS 1 Noetic hit EOL in May 2025, the ROS 1 era officially ended.

2.2 ROS 2 release cadence — every May

ROS 2 releases every May, about a month after Ubuntu's April release. Even years are 5-year LTS releases; odd years are non-LTS releases supported for about 18 months.

| Codename | Released | EOL | Ubuntu | Notes |
|---|---|---|---|---|
| Foxy Fitzroy | 2020.06 | 2023.06 | 20.04 | LTS (3-year) |
| Galactic Geochelone | 2021.05 | 2022.12 | 20.04 | Non-LTS |
| Humble Hawksbill | 2022.05 | 2027.05 | 22.04 | LTS, most widely deployed |
| Iron Irwini | 2023.05 | 2024.12 | 22.04 | Non-LTS |
| Jazzy Jalisco | 2024.05 | 2029.05 | 24.04 | LTS |
| Kilted Kaiju | 2025.05 | 2026.12 | 24.04 | Non-LTS |
| Lyrical Luth | 2026.05 | 2031.05 | 26.04 (expected) | LTS, imminent |

As of May 2026, Jazzy (production) and Kilted (latest) are the main targets, with Lyrical Luth about to land. New projects should start on Jazzy.

2.3 ROS 2 core concepts — a 5-minute refresher

# Python rclpy node example
import rclpy
from rclpy.node import Node
from std_msgs.msg import String

class HelloPublisher(Node):
    def __init__(self):
        super().__init__('hello_publisher')
        self.publisher = self.create_publisher(String, 'greetings', 10)
        self.timer = self.create_timer(1.0, self.tick)

    def tick(self):
        msg = String()
        msg.data = 'Hello, ROS 2 Jazzy'
        self.publisher.publish(msg)
        self.get_logger().info(f'Published: {msg.data}')

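def main():
    rclpy.init()
    node = HelloPublisher()
    try:
        rclpy.spin(node)          # block here, running timer callbacks
    except KeyboardInterrupt:
        pass                      # Ctrl-C is the normal way to stop the node
    finally:
        node.destroy_node()
        rclpy.try_shutdown()      # safe even if the context is already shut down

if __name__ == '__main__':
    main()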

Four core communication patterns:

  1. Topic — pub/sub, async streams (sensor data, images).
  2. Service — synchronous RPC, short request/response (parameter queries).
  3. Action — long-running tasks with feedback and cancellation (navigation, pickup).
  4. Parameter — node settings, can be changed at runtime.

2.4 Build system — colcon

ROS 2 uses colcon for builds. A typical workspace pattern:

# Create workspace
mkdir -p ros2_ws/src
cd ros2_ws

# Clone packages into src
cd src
git clone https://github.com/ros-navigation/navigation2.git -b jazzy
cd ..

# Install dependencies
rosdep install --from-paths src --ignore-src -r -y

# Build (parallel, symlink for fast dev iteration)
colcon build --symlink-install --parallel-workers $(nproc)

# Source the environment
source install/setup.bash

2.5 What changed — Jazzy → Kilted → Lyrical migration notes

Major changes since Jazzy:

  • Iceoryx2 middleware — Eclipse Iceoryx2 is stable. Microsecond IPC via shared memory.
  • Zenoh RMW — Eclipse Zenoh officially adopted as a ROS 2 middleware. Strong for multi-robot and cloud gateways.
  • rcl_logging improvements — Structured logging, JSON output.
  • Python 3.12 default.
  • Gazebo Sim integration — ros_gz package hit 1.0.

3. Nav2 (Steve Macenski) — the navigation stack

3.1 What is Nav2

Nav2 (Navigation 2) is the ROS 2 redesign of ROS 1's move_base navigation stack. It is maintained by Steve Macenski (formerly Samsung Research America), who now consults through Open Navigation LLC.

Nav2 bundles together:

  • Global planner — start-to-goal pathfinding (Navfn, SmacPlanner, ThetaStar).
  • Local planner (Controller) — path following with obstacle avoidance (DWB, RPP, MPPI).
  • Recovery Behaviors — back up, spin when stuck.
  • BT Navigator — Behavior Tree controlling the whole flow.
  • Costmap 2D — global and local cost maps.
  • Lifecycle nodes — configure/activate/deactivate state machine.
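
To make the global planner's job concrete, here is a minimal grid A* sketch in plain Python. It is not Nav2's NavFn or SmacPlanner implementation; it assumes a binary occupancy grid and 4-connected moves, whereas a real costmap carries graded costs and inflation.

```python
import heapq


def astar(grid, start, goal):
    """Shortest 4-connected path on a grid; grid[r][c] == 1 marks an obstacle."""
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for 4-connected unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_set = [(heuristic(start), 0, start)]
    came_from = {}
    best_g = {start: 0}
    while open_set:
        _, g, cell = heapq.heappop(open_set)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > best_g.get(cell, float("inf")):
            continue  # stale queue entry
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(open_set, (ng + heuristic(nxt), ng, nxt))
    return None  # goal unreachable


# A wall with a single gap on the right forces a detour.
grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
path = astar(grid, (0, 0), (2, 0))
print(path)
```

The controller (local planner) then takes this path and turns it into velocity commands while reacting to obstacles the global costmap did not know about.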

3.2 Behavior Trees for navigation flow

Nav2's signature feature is the Behavior Tree (BT). The whole navigation flow is defined in XML.

<!-- navigate_to_pose_w_replanning_and_recovery.xml -->
<root BTCPP_format="4" main_tree_to_execute="MainTree">
  <BehaviorTree ID="MainTree">
    <RecoveryNode number_of_retries="6" name="NavigateRecovery">
      <PipelineSequence name="NavigateWithReplanning">
        <RateController hz="1.0">
          <RecoveryNode number_of_retries="1" name="ComputePathToPose">
            <ComputePathToPose goal="{goal}" path="{path}" planner_id="GridBased"/>
            <ReactiveFallback name="ComputePathToPoseRecoveryFallback">
              <GoalUpdated/>
              <ClearEntireCostmap name="ClearGlobalCostmap-Context" service_name="global_costmap/clear_entirely_global_costmap"/>
            </ReactiveFallback>
          </RecoveryNode>
        </RateController>
        <RecoveryNode number_of_retries="1" name="FollowPath">
          <FollowPath path="{path}" controller_id="FollowPath"/>
          <ReactiveFallback name="FollowPathRecoveryFallback">
            <GoalUpdated/>
            <ClearEntireCostmap name="ClearLocalCostmap-Context" service_name="local_costmap/clear_entirely_local_costmap"/>
          </ReactiveFallback>
        </RecoveryNode>
      </PipelineSequence>
      <ReactiveFallback name="RecoveryFallback">
        <GoalUpdated/>
        <RoundRobin name="RecoveryActions">
          <Sequence name="RecoveryActionsSequence">
            <ClearEntireCostmap name="ClearingActions" service_name="global_costmap/clear_entirely_global_costmap"/>
            <Spin spin_dist="1.57"/>
            <Wait wait_duration="5"/>
            <BackUp backup_dist="0.30" backup_speed="0.05"/>
          </Sequence>
        </RoundRobin>
      </ReactiveFallback>
    </RecoveryNode>
  </BehaviorTree>
</root>

BT win: reconfigure behavior by editing XML, no code changes. Visualizes nicely for debugging (Groot, BehaviorTree.CPP tools).
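
The RecoveryNode that wraps most of the tree above can be sketched in a few lines of Python. This is a simplified, synchronous illustration (no RUNNING state), not BehaviorTree.CPP or Nav2 code; the `Action` class with scripted outcomes is a hypothetical stand-in for real BT leaves.

```python
SUCCESS, FAILURE = "SUCCESS", "FAILURE"


class Action:
    """Leaf node with scripted outcomes, one per tick (hypothetical test double)."""

    def __init__(self, name, results=()):
        self.name = name
        self.results = list(results)
        self.ticks = 0

    def tick(self):
        self.ticks += 1
        return self.results.pop(0) if self.results else SUCCESS


class RecoveryNode:
    """Tick `main`; on failure run `recovery`, then retry `main` a bounded number of times."""

    def __init__(self, main, recovery, number_of_retries):
        self.main = main
        self.recovery = recovery
        self.number_of_retries = number_of_retries

    def tick(self):
        retries = 0
        while True:
            if self.main.tick() == SUCCESS:
                return SUCCESS
            if retries >= self.number_of_retries:
                return FAILURE
            if self.recovery.tick() == FAILURE:
                return FAILURE
            retries += 1


# FollowPath fails twice (e.g. blocked by an obstacle), then succeeds after clearing.
follow_path = Action("FollowPath", [FAILURE, FAILURE, SUCCESS])
clear_costmap = Action("ClearLocalCostmap")
result = RecoveryNode(follow_path, clear_costmap, number_of_retries=6).tick()
print(result, follow_path.ticks, clear_costmap.ticks)
```

Changing `number_of_retries` in the XML is the same knob as the constructor argument here, which is why behavior can be tuned without recompiling.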

3.3 MPPI Controller — the 2024 standard

Nav2 has several local controllers, but the de facto standard since 2024 is the MPPI (Model Predictive Path Integral) Controller. It is a sampling-based MPC that rolls out thousands of perturbed trajectories per cycle, scores them with the critics, and blends them into the next command using a softmax weighting that favors low-cost samples.

# nav2_params.yaml (MPPI section)
controller_server:
  ros__parameters:
    controller_plugins: ["FollowPath"]
    FollowPath:
      plugin: "nav2_mppi_controller::MPPIController"
      time_steps: 56
      model_dt: 0.05
      batch_size: 2000
      vx_std: 0.2
      vy_std: 0.2
      wz_std: 0.4
      vx_max: 0.5
      vx_min: -0.35
      vy_max: 0.5
      wz_max: 1.9
      iteration_count: 1
      prune_distance: 1.7
      transform_tolerance: 0.1
      temperature: 0.3
      gamma: 0.015
      motion_model: "DiffDrive"
      visualize: false
      critics:
        - "ConstraintCritic"
        - "ObstaclesCritic"
        - "GoalCritic"
        - "GoalAngleCritic"
        - "PathAlignCritic"
        - "PathFollowCritic"
        - "PathAngleCritic"
        - "PreferForwardCritic"

MPPI handles avoidance and cornering more naturally than DWB or Regulated Pure Pursuit. Nav2's implementation is vectorized on the CPU, so it runs in real time without a GPU. Downside: many parameters to tune.
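
A toy version of the MPPI update helps when tuning `batch_size`, `temperature`, and the noise standard deviations. The sketch below is plain Python for a 1D point robot with a single goal-distance cost; it is an assumption-laden illustration of the algorithm, not the `nav2_mppi_controller` code, and all function names are hypothetical.

```python
import math
import random


def rollout_cost(x0, controls, goal, dt):
    """Integrate a 1D velocity-controlled point and sum squared goal error."""
    x, cost = x0, 0.0
    for u in controls:
        x += u * dt
        cost += (x - goal) ** 2
    return cost


def mppi_step(x0, nominal, goal, rng, batch_size=200, noise_std=0.2,
              temperature=0.3, dt=0.1, v_max=0.5):
    """One MPPI iteration: sample, score, softmax-weight, update the nominal plan."""
    horizon = len(nominal)
    noises, costs = [], []
    for _ in range(batch_size):
        eps = [rng.gauss(0.0, noise_std) for _ in range(horizon)]
        controls = [max(-v_max, min(v_max, u + e)) for u, e in zip(nominal, eps)]
        noises.append(eps)
        costs.append(rollout_cost(x0, controls, goal, dt))
    c_min = min(costs)  # subtract the minimum for numerical stability
    weights = [math.exp(-(c - c_min) / temperature) for c in costs]
    total = sum(weights)
    updated = []
    for t in range(horizon):
        delta = sum(w * eps[t] for w, eps in zip(weights, noises)) / total
        updated.append(max(-v_max, min(v_max, nominal[t] + delta)))
    return updated


def run(goal=1.0, cycles=60, horizon=20, dt=0.1, seed=0):
    rng = random.Random(seed)
    x, nominal = 0.0, [0.0] * horizon
    for _ in range(cycles):
        nominal = mppi_step(x, nominal, goal, rng, dt=dt)
        x += nominal[0] * dt                   # apply only the first control
        nominal = nominal[1:] + [nominal[-1]]  # shift the plan (receding horizon)
    return x


final_x = run()
print(f"final position: {final_x:.3f}")
```

A lower `temperature` makes the update track the best few samples more aggressively; a higher one averages more broadly and gives smoother but slower corrections.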

3.4 SLAM — two common companions

For unmapped environments, run SLAM alongside Nav2.

  • SLAM Toolbox (Steve Macenski) — 2D LiDAR based, lifelong mapping + localization. Most widely used.
  • RTAB-Map — RGB-D camera based 3D SLAM. ROS 2 supported.

# Run SLAM Toolbox async mode (mapping)
ros2 launch slam_toolbox online_async_launch.py \
  params_file:=./config/mapper_params_online_async.yaml \
  use_sim_time:=true

# Save the map
ros2 service call /slam_toolbox/save_map slam_toolbox/srv/SaveMap "{name: {data: 'my_map'}}"

4. MoveIt 2 — motion planning

4.1 What MoveIt does

MoveIt is a motion-planning framework for robot arms (manipulators). Input: "send the end-effector to (x, y, z, roll, pitch, yaw)". Output: a joint trajectory that reaches that pose without collisions.

Internally:

  1. IK (Inverse Kinematics) — solve joint angles for the target pose.
  2. Collision checking — self-collision plus environment collisions.
  3. Trajectory planning — OMPL / STOMP / CHOMP / Pilz algorithms.
  4. Trajectory execution — send to a controller on top of ros2_control.
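
Step 1 is easiest to see on a two-link planar arm, where IK has a closed form. The sketch below is a textbook illustration in plain Python, not MoveIt's KDL or TRAC-IK solvers; the link lengths are made-up values and only one of the two mirror solutions (the one with theta2 >= 0) is returned.

```python
import math


def forward_kinematics(theta1, theta2, l1=0.4, l2=0.3):
    """End-effector (x, y) of a 2-link planar arm; angles in radians."""
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y


def inverse_kinematics(x, y, l1=0.4, l2=0.3):
    """Closed-form IK; raises ValueError when the target is out of reach."""
    cos_t2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(cos_t2) > 1.0:
        raise ValueError("target outside the reachable workspace")
    theta2 = math.acos(cos_t2)
    theta1 = math.atan2(y, x) - math.atan2(l2 * math.sin(theta2),
                                           l1 + l2 * math.cos(theta2))
    return theta1, theta2


target = (0.5, 0.2)
joints = inverse_kinematics(*target)
reached = forward_kinematics(*joints)
print(joints, reached)
```

Real 6- or 7-DOF arms rarely have such a neat closed form, which is why MoveIt delegates to numerical or generated solvers and then still has to run steps 2 to 4 on the result.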

4.2 MoveIt Setup Assistant

The starting point for adding a new robot to MoveIt is the Setup Assistant GUI.

ros2 launch moveit_setup_assistant setup_assistant.launch.py

Load a URDF / Xacro and the GUI lets you:

  • Auto-generate the Self-Collision Matrix.
  • Define Planning Groups (e.g., arm + gripper).
  • Register pre-defined poses (home, ready).
  • Specify end effectors.
  • Configure controllers and sensors.

The result is a moveit_config package.

4.3 MoveIt 2 example — pick and place

# Python MoveIt 2 (moveit_py)
import rclpy
from moveit.planning import MoveItPy
from geometry_msgs.msg import PoseStamped

def main():
    rclpy.init()
    moveit = MoveItPy(node_name="moveit_py_demo")
    arm = moveit.get_planning_component("manipulator")

    target_pose = PoseStamped()
    target_pose.header.frame_id = "base_link"
    target_pose.pose.position.x = 0.4
    target_pose.pose.position.y = 0.1
    target_pose.pose.position.z = 0.5
    target_pose.pose.orientation.w = 1.0

    arm.set_start_state_to_current_state()
    arm.set_goal_state(pose_stamped_msg=target_pose, pose_link="tool0")
    plan_result = arm.plan()

    if plan_result:
        robot_trajectory = plan_result.trajectory
        moveit.execute(robot_trajectory, controllers=[])
    else:
        print("Planning failed")

    moveit.shutdown()
    rclpy.shutdown()

if __name__ == "__main__":
    main()

The C++ API most often uses MoveGroupInterface. Both APIs drive the same planning pipeline, which defaults to OMPL.

4.4 OMPL vs STOMP vs Pilz

| Planner | Type | Strengths | Weaknesses |
|---|---|---|---|
| OMPL (RRTConnect, PRM) | Sampling | High-dim friendly | Jagged paths, post-processing needed |
| STOMP | Optimization | Smooth paths | Needs initial solution, slow |
| CHOMP | Optimization | Smoothness | Local optima |
| Pilz Industrial Motion Planner | Deterministic | LIN/CIRC/PTP, industrial | Bad for general motion |

Start with RRTConnect by default; use Pilz for repetitive industrial motions.

4.5 Connection to ros2_control

MoveIt 2 only plans — execution is handled by ros2_control. The standard interface between them is the JointTrajectoryController.

# controller_manager config
controller_manager:
  ros__parameters:
    update_rate: 100
    joint_trajectory_controller:
      type: joint_trajectory_controller/JointTrajectoryController

joint_trajectory_controller:
  ros__parameters:
    joints:
      - shoulder_pan_joint
      - shoulder_lift_joint
      - elbow_joint
      - wrist_1_joint
      - wrist_2_joint
      - wrist_3_joint
    command_interfaces:
      - position
    state_interfaces:
      - position
      - velocity

5. Gazebo Harmonic — the standard simulator

5.1 Death of Gazebo Classic, birth of New Gazebo

What ROS users casually call "Gazebo" is actually two different pieces of software.

  • Gazebo Classic (gazebo7~gazebo11) — Started 2002, EOL January 2025.
  • New Gazebo (Citadel, Edifice, Fortress, Garden, Harmonic) — A rewrite from Ignition Robotics. In 2022, the project dropped the Ignition brand and reclaimed the Gazebo name.

As of 2026, new projects unconditionally pick New Gazebo Harmonic (2023.09 LTS, supported through 2028).

5.2 Gazebo Harmonic highlights

  • gz-sim — simulation engine, ECS (Entity-Component-System) architecture.
  • DART as the default physics engine (Bullet and TPE also selectable through gz-physics).
  • OGRE 2 renderer — Metal / Vulkan / OpenGL.
  • SDFormat 1.10 — world and robot description format.
  • GUI — Qt-based, pluggable.
  • ROS 2 integration — ros_gz_bridge for topic mapping.

5.3 SDF world example

<?xml version="1.0"?>
<sdf version="1.10">
  <world name="warehouse">
    <physics name="1ms" type="ignored">
      <max_step_size>0.001</max_step_size>
      <real_time_factor>1.0</real_time_factor>
    </physics>

    <plugin filename="gz-sim-physics-system" name="gz::sim::systems::Physics"/>
    <plugin filename="gz-sim-sensors-system" name="gz::sim::systems::Sensors">
      <render_engine>ogre2</render_engine>
    </plugin>
    <plugin filename="gz-sim-scene-broadcaster-system" name="gz::sim::systems::SceneBroadcaster"/>

    <include>
      <uri>https://fuel.gazebosim.org/1.0/OpenRobotics/models/Sun</uri>
    </include>
    <include>
      <uri>https://fuel.gazebosim.org/1.0/OpenRobotics/models/Ground Plane</uri>
    </include>

    <include>
      <name>turtlebot</name>
      <pose>0 0 0.1 0 0 0</pose>
      <uri>https://fuel.gazebosim.org/1.0/OpenRobotics/models/Turtlebot4</uri>
    </include>
  </world>
</sdf>

5.4 ros_gz_bridge to connect ROS 2 topics

Gazebo topics use Gazebo Transport (a separate middleware); ROS 2 uses DDS. ros_gz_bridge connects them.

# '@' bridges /cmd_vel in both directions; '[' bridges Gazebo -> ROS 2 only
ros2 run ros_gz_bridge parameter_bridge \
  /cmd_vel@geometry_msgs/msg/Twist@gz.msgs.Twist \
  /odom@nav_msgs/msg/Odometry[gz.msgs.Odometry \
  /scan@sensor_msgs/msg/LaserScan[gz.msgs.LaserScan
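
The bridge argument syntax packs three things into one string: the topic, the ROS 2 type, and the Gazebo type, with the second separator encoding direction (`@` both ways, `[` Gazebo to ROS 2, `]` ROS 2 to Gazebo). A small parser sketch makes the convention explicit. `parse_bridge_arg` is a hypothetical helper written for this article, not part of ros_gz.

```python
DIRECTIONS = {
    "@": "bidirectional",
    "[": "gz_to_ros",
    "]": "ros_to_gz",
}


def parse_bridge_arg(arg):
    """Split '/topic@ros_type<dir>gz_type' into its four parts."""
    topic, sep, rest = arg.partition("@")
    if not sep or not topic:
        raise ValueError(f"missing '@' after topic name: {arg!r}")
    for i, ch in enumerate(rest):
        if ch in DIRECTIONS:
            return {
                "topic": topic,
                "ros_type": rest[:i],
                "gz_type": rest[i + 1:],
                "direction": DIRECTIONS[ch],
            }
    raise ValueError(f"missing direction separator in: {arg!r}")


for spec in (
    "/cmd_vel@geometry_msgs/msg/Twist@gz.msgs.Twist",
    "/odom@nav_msgs/msg/Odometry[gz.msgs.Odometry",
    "/scan@sensor_msgs/msg/LaserScan[gz.msgs.LaserScan",
):
    print(parse_bridge_arg(spec))
```

Using one-way bridges for sensor topics avoids echoing ROS 2 messages back into the simulator.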

5.5 Limits — why Isaac Sim came alongside

Gazebo Harmonic is excellent, but has limits.

  • GPU utilization — Rendering uses the GPU, but physics is still CPU. Hard to run dozens or hundreds of parallel environments.
  • Photorealistic rendering — Insufficient for vision-training synthetic data.
  • Soft body / fluids — Limited.

Isaac Sim fills this gap.


6. Isaac Sim 5.0 (NVIDIA Omniverse) + Isaac ROS

6.1 What sets Isaac Sim apart

NVIDIA Isaac Sim is a robot simulator built on Omniverse. Isaac Sim 5.0 shipped in January 2026, and its core features are:

  • PhysX 5 — GPU-accelerated physics, thousands of parallel environments.
  • RTX path tracing — photorealistic rendering, optimal for ML synthetic data.
  • OpenUSD scene graph — the standard pushed by Pixar, Apple, and Adobe together.
  • Isaac Lab — reinforcement learning framework, Gym-compatible.
  • GR00T integration — humanoid foundation model data generation.

6.2 Isaac Lab reinforcement learning code (simple example)

# Building a Cartpole RL env with Isaac Lab (conceptual sketch; module paths follow
# Isaac Lab 2.x, where the omni.isaac.lab namespace became isaaclab)
from isaaclab.app import AppLauncher

# The simulator app must be launched before any other isaaclab import
app_launcher = AppLauncher(headless=True)
simulation_app = app_launcher.app

import torch
from isaaclab.envs import ManagerBasedRLEnv
from isaaclab_tasks.manager_based.classic.cartpole.cartpole_env_cfg import CartpoleEnvCfg

env_cfg = CartpoleEnvCfg()
env_cfg.scene.num_envs = 4096      # parallel environments on one GPU
env_cfg.sim.device = "cuda:0"

env = ManagerBasedRLEnv(cfg=env_cfg)
env.reset()

# Step every environment with random actions to exercise the batched interface
for _ in range(100):
    actions = torch.randn_like(env.action_manager.action)
    obs, rew, terminated, truncated, info = env.step(actions)

env.close()
simulation_app.close()

# Training (PPO via skrl, rsl_rl, ...) is normally launched through Isaac Lab's
# workflow scripts rather than written by hand, e.g.:
#   ./isaaclab.sh -p scripts/reinforcement_learning/skrl/train.py --task Isaac-Cartpole-v0 --headless

Run 4096 parallel environments on a single RTX 4090 and a policy converges in a few days. Pulling this off in Gazebo would take dozens of CPU servers.

6.3 Isaac ROS — NVIDIA's ROS 2 accelerators

Isaac ROS is a separate suite from Isaac Sim — NVIDIA's collection of CUDA-accelerated ROS 2 packages.

| Package | Function | Acceleration |
|---|---|---|
| isaac_ros_visual_slam | Stereo visual-inertial SLAM (cuVSLAM) | GPU |
| isaac_ros_nvblox | 3D reconstruction / occupancy | GPU |
| isaac_ros_apriltag | AprilTag detection | GPU |
| isaac_ros_centerpose | 6DOF pose estimation | TensorRT |
| isaac_ros_dnn_image_encoder | camera → ML preprocessing | GPU |

Optimized for Jetson Orin / Thor boards, the same perception pipeline runs 10~30x faster than on CPU.

6.4 OpenUSD comes to robotics

OpenUSD (Universal Scene Description) was created by Pixar, and since 2023 the Alliance for OpenUSD (AOUSD) has been standardizing it. It started in film and games, but NVIDIA is pushing it as the standard for robotics.

  • Represents scenes, assets, materials, and physics as a single graph.
  • Layering (base + variant) — multiple teams can work on the same scene concurrently.
  • Python API.
  • Isaac Sim, Blender, Maya, Houdini all support it.

URDF / SDF are fine for defining a single robot, but weak for large environments like factories or cities. USD likely replaces them there.


7. MuJoCo (DeepMind) — physics simulation

7.1 The MuJoCo story — Roboti LLC → DeepMind acquisition → open source

MuJoCo (Multi-Joint dynamics with Contact) is a physics engine Emo Todorov built in the early 2010s. Originally paid commercial software (Roboti LLC), it powered some OpenAI Gym environments (Humanoid, HalfCheetah). Student licenses existed but commercial ones were expensive.

In October 2021, DeepMind acquired MuJoCo and made it free to use; the full source followed under Apache 2.0 in May 2022. That move dramatically lowered the barrier for reinforcement learning and robotics research.

7.2 MuJoCo strengths

  • Contact dynamics — Friction and impact are more accurate and stable than in other engines.
  • Speed — Optimized for parallel simulation, with a GPU backend (MJX).
  • Determinism — Same input, same output (critical for RL reproducibility).
  • MJCF — XML-based model format, more concise than URDF.

7.3 MJCF example (humanoid)

<mujoco model="humanoid">
  <option timestep="0.005" gravity="0 0 -9.81"/>

  <default>
    <joint armature="1" damping="1"/>
    <geom contype="1" conaffinity="1" friction="1 0.1 0.1" rgba="0.7 0.5 0.3 1"/>
  </default>

  <worldbody>
    <light pos="0 0 3" dir="0 0 -1" diffuse="0.7 0.7 0.7"/>
    <geom name="floor" pos="0 0 0" size="10 10 0.1" type="plane" rgba="0.9 0.9 0.9 1"/>

    <body name="torso" pos="0 0 1.4">
      <freejoint name="root"/>
      <geom name="torso_geom" type="capsule" fromto="0 -.07 0 0 .07 0" size=".07"/>

      <body name="head" pos="0 0 .19">
        <geom name="head_geom" type="sphere" size=".09"/>
      </body>

      <body name="right_thigh" pos="0 -.1 -.04">
        <joint name="right_hip" type="ball"/>
        <geom name="right_thigh_geom" type="capsule" fromto="0 0 0 0 .01 -.34" size=".06"/>

        <body name="right_shin" pos="0 .01 -.403">
          <joint name="right_knee" type="hinge" axis="0 -1 0" range="-160 -2"/>
          <geom name="right_shin_geom" type="capsule" fromto="0 0 0 0 0 -.3" size=".049"/>
        </body>
      </body>
      <!-- left side symmetric -->
    </body>
  </worldbody>

  <actuator>
    <motor name="right_knee" joint="right_knee" gear="200"/>
  </actuator>
</mujoco>

7.4 MJX — JAX backend for GPU acceleration

Shipped with MuJoCo 3.0 in October 2023, MJX is MuJoCo on JAX: simulate tens of thousands of environments on GPU / TPU at once. Same category as Brax (Google) or IsaacGym.

import jax
import mujoco
from mujoco import mjx

mj_model = mujoco.MjModel.from_xml_path("humanoid.xml")
mjx_model = mjx.put_model(mj_model)

# Simulate 4096 envs in parallel
batch_size = 4096
keys = jax.random.split(jax.random.PRNGKey(0), batch_size)

@jax.jit
@jax.vmap
def reset(key):
    data = mjx.make_data(mjx_model)
    return data

@jax.jit
@jax.vmap
def step(data, action):
    data = data.replace(ctrl=action)
    data = mjx.step(mjx_model, data)
    return data

batch_data = reset(keys)
actions = jax.numpy.zeros((batch_size, mj_model.nu))

for _ in range(1000):
    batch_data = step(batch_data, actions)

7.5 MuJoCo Menagerie — official robot collection

The MuJoCo Menagerie repo maintained by Google DeepMind has validated MJCFs for a wide range of real robots: humanoids, quadrupeds, arms, and hands.

  • Unitree H1, G1, Go2
  • Boston Dynamics Spot
  • Franka Emika Panda
  • KUKA iiwa14
  • Shadow Dexterous Hand
  • ANYbotics ANYmal C
  • Universal Robots UR5e / UR10e

Drop them in for RL or imitation-learning simulations directly.


8. Webots (Cyberbotics) — the lightweight alternative

8.1 Webots' niche

Webots is a robot simulator built since 1996 by Swiss company Cyberbotics. Open-sourced (Apache 2.0) in 2018.

Why use Webots when Gazebo, Isaac Sim, and MuJoCo exist?

  • Lightweight — Gazebo Harmonic is a pain to install; Webots is a single package and done.
  • Education — Standard in competitions like RoboCup, DARPA Subterranean, DJI RoboMaster.
  • Rich built-in robots — Around 100 robot models bundled (e-puck, NAO, Pioneer, TurtleBot).
  • C / C++ / Python / Java / MATLAB / ROS / ROS 2 all supported.

8.2 Webots controller example

# Webots Python controller
from controller import Robot

TIME_STEP = 32
MAX_SPEED = 6.28

robot = Robot()
left_motor = robot.getDevice("left wheel motor")
right_motor = robot.getDevice("right wheel motor")
left_motor.setPosition(float('inf'))
right_motor.setPosition(float('inf'))

ds_left = robot.getDevice("ds_left")
ds_right = robot.getDevice("ds_right")
ds_left.enable(TIME_STEP)
ds_right.enable(TIME_STEP)

while robot.step(TIME_STEP) != -1:
    left = ds_left.getValue()
    right = ds_right.getValue()
    if left < 100:
        left_motor.setVelocity(-MAX_SPEED * 0.5)
        right_motor.setVelocity(MAX_SPEED * 0.5)
    elif right < 100:
        left_motor.setVelocity(MAX_SPEED * 0.5)
        right_motor.setVelocity(-MAX_SPEED * 0.5)
    else:
        left_motor.setVelocity(MAX_SPEED)
        right_motor.setVelocity(MAX_SPEED)

8.3 Which simulator for which job

| Use case | Recommended |
|---|---|
| Learning, tutorials, education | Webots |
| Industrial mobile robots (ROS 2 integration) | Gazebo Harmonic |
| RL, humanoids, synthetic data | Isaac Sim 5.0 |
| Manipulation, dexterity, contact-rich tasks | MuJoCo |
| Fast parallel RL | MJX or Isaac Lab |
| Multi-robot, drone swarms | Gazebo + Zenoh |

9. Hugging Face LeRobot (2024.10) — democratizing robotics ML

9.1 Why LeRobot emerged

In October 2024, Hugging Face launched LeRobot. Catchphrase: "Robotics for everyone". Core idea — bundle these together:

  • Pre-trained models — π0, ACT, Diffusion Policy, TDMPC, etc.
  • Dataset hub — Open X-Embodiment, BridgeData V2, DROID in hf-datasets format.
  • Affordable hardware kits — SO-100 (around 200 USD), SO-101, Koch v1.1, Stanford Mobile Aloha clones.
  • Tutorials — Run a full imitation-learning cycle in Colab in 30 minutes.

The point is full integration with PyTorch + the Hugging Face ecosystem. Same play Transformers ran on NLP, now for robotics.

9.2 LeRobot data pipeline

# Load a LeRobot dataset
# (import path as in early-2025 releases; newer versions may expose it under lerobot.datasets)
from lerobot.common.datasets.lerobot_dataset import LeRobotDataset

dataset = LeRobotDataset("lerobot/aloha_static_coffee")
print(dataset.meta.info)

# Indexing returns a single frame (one timestep), not a whole episode
frame = dataset[0]
print(frame["observation.images.cam_high"].shape)
print(frame["action"].shape)

9.3 ACT imitation learning

ACT (Action Chunking Transformer) is the imitation-learning baseline from the Stanford Mobile Aloha team. Trainable directly inside LeRobot.

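# Flags below follow the early-2025 LeRobot CLI; later releases renamed these entry points,
# so check the docs for your installed version.

# Collect data with an SO-100 arm
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.fps=30 \
  --control.single_task="Pick up the block and place it in the bin" \
  --control.repo_id=youngju/so100_pick_and_place \
  --control.num_episodes=50 \
  --control.warmup_time_s=5 \
  --control.episode_time_s=30 \
  --control.push_to_hub=true

# Train the ACT policy
python lerobot/scripts/train.py \
  --dataset.repo_id=youngju/so100_pick_and_place \
  --policy.type=act \
  --policy.device=cuda \
  --output_dir=outputs/train/act_so100_pick_and_place \
  --job_name=act_so100_pick_and_place \
  --steps=80000 \
  --batch_size=32

# Evaluate on the real robot: record again, this time driven by the trained policy
python lerobot/scripts/control_robot.py \
  --robot.type=so100 \
  --control.type=record \
  --control.fps=30 \
  --control.single_task="Pick up the block and place it in the bin" \
  --control.repo_id=youngju/eval_act_so100_pick_and_place \
  --control.num_episodes=10 \
  --control.policy.path=outputs/train/act_so100_pick_and_place/checkpoints/last/pretrained_model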
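
The "action chunking" in ACT's name means the policy predicts a short sequence of future actions per query, and overlapping predictions for the same timestep are blended by temporal ensembling with weights w_i = exp(-m * i), where i = 0 is the oldest prediction. A minimal sketch follows. The `toy_policy` is a hypothetical stand-in for the trained transformer, and the code is not LeRobot's implementation.

```python
import math


def temporal_ensemble(predictions, m=0.01):
    """Blend actions proposed for the same timestep, ordered oldest first."""
    weights = [math.exp(-m * i) for i in range(len(predictions))]
    total = sum(weights)
    dim = len(predictions[0])
    return [
        sum(w * p[d] for w, p in zip(weights, predictions)) / total
        for d in range(dim)
    ]


def run_chunked_policy(policy, steps, m=0.01):
    """Query the policy every step and ensemble the overlapping chunk predictions."""
    proposals = {}  # timestep -> actions proposed for it, oldest first
    executed = []
    for t in range(steps):
        chunk = policy(t)  # actions for timesteps t, t+1, ..., t+len(chunk)-1
        for k, action in enumerate(chunk):
            proposals.setdefault(t + k, []).append(action)
        executed.append(temporal_ensemble(proposals.pop(t), m))
    return executed


def toy_policy(t, chunk_size=4):
    """Hypothetical policy: always predicts the 1D ramp action(s) = s."""
    return [[float(t + k)] for k in range(chunk_size)]


executed = run_chunked_policy(toy_policy, steps=6)
print(executed)
```

Smaller `m` weights old and new predictions almost equally (smoother motion); larger `m` leans on the oldest prediction, per the exponential weighting above.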

9.4 SO-100 — robot learning that starts at 200 USD

SO-100 (originally designed by The Robot Studio) is a 5DOF desktop arm whose parts cost around 200~300 USD. It is built from 3D-printed parts plus Feetech STS3215 servos (the related Koch v1.1 arm uses Dynamixel servos from Robotis). LeRobot ships SO-100 / SO-101 drivers out of the box, so the data collection / training / execution cycle runs at home.

That's LeRobot's real impact. Just as NLP was expensive and gated before the ChatGPT API, robotics research only happened on top of a 50K USD industrial arm. Now that floor is 300 USD.


10. NVIDIA GR00T — humanoid foundation model

10.1 Project GR00T begins

Jensen Huang announced Project GR00T (Generalist Robot 00 Technology) at GTC March 2024. In one line: "GPT for humanoids".

The project bundles three things.

  1. GR00T N1 (2025.03) — 2B-parameter VLA with openly released weights.
  2. Isaac Lab + Cosmos — simulation and synthetic-data generation.
  3. Jetson Thor — humanoid-onboard inference SoC, shipped 2025.

10.2 GR00T N1 architecture

GR00T N1's core is a System 1 / System 2 dual mode (borrowed directly from Kahneman).

  • System 1 (fast) — Diffusion Transformer, action output at around 120Hz. Reflexive motor control.
  • System 2 (slow) — VLM (Eagle-2), reasoning and planning at around 10Hz. Environment understanding.

The two connect through token cross-attention. Same as how a human catches a ball — slow planning (System 2: "ball falling → reach hand") and fine-grained finger adjustment (System 1) happen simultaneously.
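
The dual-rate idea can be sketched as one loop running at the fast rate, with the slow module refreshed every N ticks and its latest output reused in between. The rates and function names below are illustrative assumptions, not GR00T's actual scheduling code.

```python
def run_dual_rate(fast_hz, slow_hz, duration_s, plan_fn, act_fn):
    """Run a fast controller that reuses the latest output of a slow planner."""
    ticks_per_plan = fast_hz // slow_hz
    latent = None
    actions = []
    for tick in range(int(duration_s * fast_hz)):
        if tick % ticks_per_plan == 0:
            latent = plan_fn(tick)            # slow "System 2" update
        actions.append(act_fn(latent, tick))  # fast "System 1" step
    return actions


calls = {"plan": 0, "act": 0}


def toy_plan(tick):
    """Hypothetical planner: returns the tick at which the plan was made."""
    calls["plan"] += 1
    return tick


def toy_act(latent, tick):
    """Hypothetical controller: reports how stale the plan it used was."""
    calls["act"] += 1
    return tick - latent


staleness = run_dual_rate(fast_hz=100, slow_hz=10, duration_s=1,
                          plan_fn=toy_plan, act_fn=toy_act)
print(calls, max(staleness))
```

The staleness value shows the cost of the split: the fast controller acts on a plan that can be up to one slow period old, which is acceptable as long as the scene changes slowly relative to the planner rate.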

10.3 GR00T data pyramid

NVIDIA's data strategy:

        Real robot demonstrations (tens to hundreds of hours)
            ^   (least data, most expensive)
            |
        Tele-op (VR, e.g. Apple Vision Pro)
            |
        Simulation (Isaac Sim parallel envs)
            |
        Cosmos synthetic video (millions of hours)
            |
            v   (most data, cheapest)
        Internet video / documents

Bottom to top: less quantity, higher quality. GR00T pre-trains on internet video, augments with Cosmos synthetic data, fine-tunes via RL in Isaac Sim, and finishes with real demonstrations.

10.4 GR00T inference code (conceptual)

# GR00T inference (conceptual interface; GR00TPolicy, predict(), and robot are
# illustrative placeholders, not the actual Isaac-GR00T API)
from gr00t import GR00TPolicy

policy = GR00TPolicy.from_pretrained("nvidia/gr00t-n1")

observation = {
    "video.image_left": ...,    # left camera
    "video.image_right": ...,   # right camera
    "state.joint_pos": ...,     # 28 DOF humanoid
    "language.instruction": "Pick up the red block and place it on the shelf"
}

simulation_steps = 1000  # placeholder episode length

# Inside the policy, System 2 refreshes its plan at a low rate while
# System 1 emits actions at a much higher rate
for t in range(simulation_steps):
    action = policy.predict(observation)  # 28D action vector
    observation = robot.step(action)      # robot: placeholder sim/hardware interface

10.5 Other humanoid foundation models

  • Figure Helix (2025.02) — Figure AI's in-house model after leaving OpenAI partnership.
  • 1X World Model — Video-prediction model for the home robot.
  • Tesla Optimus FSD — Self-driving stack ported to humanoids.
  • Physical Intelligence π0 — Followed by the next-generation π0.5.

11. VLA models — RT-2, OpenVLA, π0

11.1 What is a VLA

A VLA (Vision-Language-Action) model takes an image (V) plus a language instruction (L) and outputs action (A) tokens. The next-token prediction pattern from LLMs, applied to actions.

[ Image frames ] -> vision encoder ----+
[ "Pick up apple" ] -> tokenizer ------+-> Transformer -> [ action tokens ]
[ Robot state ] -> projector ----------+                       |
                                                               v
                                                  [ dx, dy, dz, gripper ]

11.2 RT-2 (2023.07) — first signal

Google DeepMind's RT-2 was a VLM (PaLI-X 55B, PaLM-E 12B) fine-tuned with RT-1 robot data. The trick: encode actions as tokens and add them to the LLM vocabulary.

  • Input: camera image + "Move the empty can to the apple"
  • Output: <action>+13 +127 -3 +1 0 0 0 1</action> (6DOF delta plus gripper)

What stunned people: it generalized to never-seen objects (extinct animal cards) and never-seen commands ("move the puppy picture").
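
The "actions as tokens" trick comes down to uniform discretization: each action dimension is clipped to a range and mapped to one of 256 bins, and decoding returns the bin center. The sketch below illustrates that idea in plain Python. The ranges are made-up values, and it does not reproduce RT-2's exact token layout or vocabulary mapping.

```python
def encode_action(action, low, high, bins=256):
    """Map each continuous action dimension to an integer bin in [0, bins - 1]."""
    tokens = []
    for a, lo, hi in zip(action, low, high):
        a = min(max(a, lo), hi)  # clip to the normalization range
        tokens.append(min(int((a - lo) / (hi - lo) * bins), bins - 1))
    return tokens


def decode_action(tokens, low, high, bins=256):
    """Map bins back to continuous values at the bin centers."""
    return [
        lo + (tok + 0.5) * (hi - lo) / bins
        for tok, lo, hi in zip(tokens, low, high)
    ]


# 6DOF delta plus gripper, all normalized to [-1, 1] (an assumption for this sketch)
low, high = [-1.0] * 7, [1.0] * 7
action = [0.10, -0.25, 0.0, 0.0, 0.0, 0.5, 1.0]
tokens = encode_action(action, low, high)
recovered = decode_action(tokens, low, high)
print(tokens)
print(recovered)
```

The round trip loses at most half a bin width per dimension, which is the precision price paid for reusing a language model's discrete output head.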

11.3 OpenVLA (2024.06) — first open-source VLA

Stanford, UC Berkeley, and Toyota Research released OpenVLA (7B), based on the Prismatic VLM.

  • Trained on Open X-Embodiment 970K trajectories.
  • 14 days of training on 64 A100s.
  • Weights, code, data all public.
  • Can be fine-tuned to new robots and tasks with LoRA.

from transformers import AutoModelForVision2Seq, AutoProcessor
from PIL import Image
import torch

processor = AutoProcessor.from_pretrained("openvla/openvla-7b", trust_remote_code=True)
vla = AutoModelForVision2Seq.from_pretrained(
    "openvla/openvla-7b",
    attn_implementation="flash_attention_2",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True
).to("cuda:0")

image = Image.open("camera_view.png")
prompt = "In: What action should the robot take to pick up the cup?\nOut:"
inputs = processor(prompt, image).to("cuda:0", dtype=torch.bfloat16)

action = vla.predict_action(**inputs, unnorm_key="bridge_orig", do_sample=False)
print(action)  # [dx, dy, dz, droll, dpitch, dyaw, gripper]
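
The `unnorm_key` argument above selects per-dataset statistics that map the model's normalized output back to robot units. The sketch below shows one plausible form of that step, assuming percentile-based scaling from [-1, 1] with a mask that leaves the gripper untouched. The statistics and the function name `unnormalize_action` are hypothetical; the real values ship with the checkpoint.

```python
def unnormalize_action(normalized, q01, q99, mask):
    """Map model outputs in [-1, 1] back to robot units, per dimension."""
    robot_action = []
    for value, low, high, scaled in zip(normalized, q01, q99, mask):
        if scaled:
            # Linear map: -1 -> low percentile, +1 -> high percentile.
            robot_action.append(0.5 * (value + 1.0) * (high - low) + low)
        else:
            # Unscaled dimensions (here the gripper) pass through unchanged.
            robot_action.append(value)
    return robot_action


# Hypothetical statistics for one dataset (assumed values, for illustration).
stats = {
    "q01": [-0.03, -0.03, -0.03, -0.10, -0.10, -0.10, 0.0],
    "q99": [0.03, 0.03, 0.03, 0.10, 0.10, 0.10, 1.0],
    "mask": [True, True, True, True, True, True, False],
}

normalized = [0.0, 1.0, -1.0, 0.5, 0.0, 0.0, 1.0]
robot_units = unnormalize_action(
    normalized, stats["q01"], stats["q99"], stats["mask"]
)
print(robot_units)
```

Picking the wrong key silently rescales every motion, so the statistics should match the robot and dataset the policy was fine-tuned on.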

11.4 π0 (Physical Intelligence, 2024.10)

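Physical Intelligence (PI) was founded by Sergey Levine, Chelsea Finn, and others. π0 (pi-zero) is a flow-matching VLA of roughly 3.3B parameters: a PaliGemma 3B vision-language backbone plus an action expert of about 300M parameters.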

The core innovation is action chunking combined with flow matching:

  • Generate a 50-step action chunk per inference, roughly one second of motion at a 50Hz control rate.
  • Flow matching instead of diffusion for smooth action distributions.
  • Dexterity demos: folding clothes, washing dishes, packing boxes.

# π0 inference pseudocode (class and method names are illustrative)
from openpi.models.pi0 import Pi0Policy

policy = Pi0Policy.from_checkpoint("pi0_base")
action_chunk = policy.sample_action(
    images={"primary": img1, "wrist": img2},
    instruction="Fold the blue shirt",
    state=robot_state,
)
# action_chunk: shape [50, action_dim]
for action in action_chunk:
    robot.execute(action)
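
Flow-matching sampling starts from noise and integrates a velocity field from τ = 0 to τ = 1 to produce the whole chunk at once. The sketch below uses a hand-written velocity field in place of the learned network, so it shows the integration loop only. `toy_velocity`, the ramp-shaped `target`, and the choice of 10 Euler steps are assumptions for illustration.

```python
import random


def toy_velocity(chunk, tau, target):
    """Velocity field that carries any sample to `target` at tau = 1.

    Stand-in for a learned network v(chunk, tau, observation).
    """
    return [[(goal - x) / (1.0 - tau) for x, goal in zip(row, goal_row)]
            for row, goal_row in zip(chunk, target)]


def sample_chunk(target, num_steps=10, seed=0):
    """Euler-integrate the velocity field from Gaussian noise to an action chunk."""
    rng = random.Random(seed)
    horizon, action_dim = len(target), len(target[0])
    # One noisy row per future time step in the chunk.
    chunk = [[rng.gauss(0.0, 1.0) for _ in range(action_dim)]
             for _ in range(horizon)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        tau = k / num_steps
        velocity = toy_velocity(chunk, tau, target)
        chunk = [[x + dt * v for x, v in zip(row, v_row)]
                 for row, v_row in zip(chunk, velocity)]
    return chunk


# Hypothetical target: a 50-step ramp over a 7-dimensional action space.
target = [[0.01 * step] * 7 for step in range(50)]
chunk = sample_chunk(target)
max_error = max(abs(x - goal)
                for row, goal_row in zip(chunk, target)
                for x, goal in zip(row, goal_row))
print(len(chunk), len(chunk[0]), max_error < 1e-9)
```

A real policy replaces `toy_velocity` with a network conditioned on images, language, and robot state. A few integration steps yield the full chunk, so the robot can execute many actions per model call.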

In April 2025 PI announced π0.5; some π0-family weights are available on Hugging Face.

11.5 VLA model comparison

| Model | By | Params | Data | Open |
|---|---|---|---|---|
| RT-2 | Google DeepMind | 55B (PaLI-X) | RT-1 + web | Closed |
| OpenVLA | Stanford / Berkeley | 7B | OXE 970K | Fully open |
| π0 | Physical Intelligence | 3.3B | OXE + own | Partially open |
| GR00T N1 | NVIDIA | 2B | Synthetic + real | Weights open |
| Figure Helix | Figure AI | Closed | Own data | Closed |
| Tesla Optimus | Tesla | Closed | Own data | Closed |

11.6 Limits and what comes next

VLAs don't solve everything. As of 2026:

  1. Long-horizon tasks — Difficult beyond 30 seconds. Planning (System 2) is weak.
  2. Data efficiency — New tasks need tens to hundreds of demonstrations. One-shot is far off.
  3. Safety guarantees — Formally certifying neural policy safety is hard.
  4. Hardware-software coupling — The same model rarely transfers well between robots.

12. Foxglove Studio — data visualization

12.1 RViz2's limits, Foxglove's arrival

ROS 2's default visualization tool is RViz2. It is powerful, but limited in several ways:

  • ROS 2 dependent — Hard to inspect non-ROS systems (MAVLink, custom protocols).
  • Weak log (rosbag) analysis.
  • Hard to collaborate or share (only screenshots and recordings).
  • No web or tablet support.

Foxglove Studio is a multi-protocol visualization tool from a company founded by ex-Cruise engineers in 2021.

  • Desktop + web + cloud supported simultaneously.
  • ROS 1, ROS 2, MCAP, MAVLink, MQTT, raw WebSocket all supported.
  • 3D, Plot, Image, State Transitions, Map, Tab and many other panels.
  • MCAP has become the de facto log format, replacing SQLite3 as the rosbag2 storage backend.

12.2 MCAP — the new log standard

MCAP (Message Capture) is a log container format from Foxglove. It has been rosbag2's default storage format since ROS 2 Iron (2023), and remains so in Jazzy.

Features:

  • Indexed for fast random access.
  • Embeds message definitions, so it's self-describing.
  • Supports ROS, protobuf, JSON, flatbuffers.
  • C++ / Python / Go / Rust / Swift / TypeScript libraries.

# Record rosbag2 as MCAP (default in Jazzy)
ros2 bag record -a -s mcap -o my_session

# Inspect MCAP info
mcap info my_session/my_session_0.mcap

# Convert a ROS 1 bag to MCAP
mcap convert old.bag new.mcap
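
To see why "indexed for fast random access" matters, consider a toy log that keeps a sorted time index next to the raw bytes. This is not the MCAP file layout or its library API. `IndexedLog` and its methods are hypothetical, and the records are JSON lines chosen only to keep the sketch self-contained.

```python
import bisect
import json


class IndexedLog:
    """Toy append-only log with a time index (not the real MCAP layout)."""

    def __init__(self):
        self._data = bytearray()
        self._times = []     # non-decreasing log times in nanoseconds
        self._offsets = []   # byte offset of each record inside _data

    def append(self, log_time_ns, topic, payload):
        if self._times and log_time_ns < self._times[-1]:
            raise ValueError("this toy log expects non-decreasing timestamps")
        record = json.dumps(
            {"t": log_time_ns, "topic": topic, "msg": payload}
        ).encode() + b"\n"
        self._times.append(log_time_ns)
        self._offsets.append(len(self._data))
        self._data += record

    def read_from(self, start_ns):
        """Yield records with log time >= start_ns, skipping earlier bytes."""
        first = bisect.bisect_left(self._times, start_ns)
        if first == len(self._offsets):
            return
        position = self._offsets[first]  # jump straight to the first match
        for line in bytes(self._data[position:]).splitlines():
            yield json.loads(line)


log = IndexedLog()
for i in range(1000):
    log.append(i * 10_000_000, "/odom", {"x": i * 0.01})  # 100 Hz samples

hits = list(log.read_from(9_500_000_000))  # everything from t = 9.5 s onward
print(len(hits), hits[0]["t"])
```

The binary search touches only the index, so seeking to a timestamp costs about the same whether the log holds a thousand messages or a billion.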

12.3 Viewing data in Foxglove

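# Live ROS 2 topics in the desktop app, via foxglove_bridge
sudo apt install ros-jazzy-foxglove-bridge
ros2 launch foxglove_bridge foxglove_bridge_launch.xml
# - Launch Foxglove
# - Open connection -> Foxglove WebSocket -> ws://localhost:8765

# Or drag-drop an MCAP file in the web app
# https://app.foxglove.dev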

Collaboration wins: teammates open the same MCAP and see the same panel layout at the same timestamp. A pull request can link to "the bug at t=23.4s".

12.4 PlotJuggler — signal analysis

For numerical (sensor / controller) time-series analysis, PlotJuggler is the most convenient tool.

sudo apt install ros-jazzy-plotjuggler-ros
ros2 run plotjuggler plotjuggler

Drag-drop CSV, rosbag2, MCAP, or live topics to plot. Easy formula-based channel creation and filtering.
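
The kind of derived channel described above, a formula over existing series followed by a filter, amounts to the computation below. This is a plain-Python sketch of the idea, not PlotJuggler's own function editor. The signal values and function names are hypothetical.

```python
from collections import deque


def tracking_error(setpoint, measured):
    """Derived channel: element-wise difference of two equally sampled series."""
    return [s - m for s, m in zip(setpoint, measured)]


def moving_average(samples, window):
    """Causal moving average; early outputs average the samples seen so far."""
    buffer = deque(maxlen=window)
    smoothed = []
    for sample in samples:
        buffer.append(sample)
        smoothed.append(sum(buffer) / len(buffer))
    return smoothed


# Hypothetical controller data: a unit step setpoint and a measured response.
setpoint = [1.0] * 6
measured = [0.0, 0.5, 0.75, 1.0, 1.25, 1.0]

error = tracking_error(setpoint, measured)
smooth_error = moving_average(error, window=3)
print(error)
print(smooth_error)
```

Plotting `error` and `smooth_error` side by side makes overshoot and settling behavior visible at a glance.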


13. Korea and Japan robotics — from industrial powerhouses to humanoids

13.1 Korea — Hyundai's Boston Dynamics bet

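Hyundai Motor Group agreed in December 2020 to buy a controlling stake of about 80% in Boston Dynamics from SoftBank Group, in a deal that valued the company at 1.1 billion USD and closed in June 2021. Many called it expensive at the time, but with the humanoid boom the logic of the bet has become clear.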

  • Atlas (Electric) — Hydraulic Atlas retired April 2024, new electric Atlas unveiled. Learning vehicle assembly at Hyundai's Metaplant in Georgia.
  • Spot — Quadruped, used for security and inspection at Hyundai and Kia factories.
  • Stretch — Warehouse robot for box handling.

Other Korean players:

  • Doosan Robotics — Collaborative robots (cobot) M-series and H-series. Famous for applications like Korean chicken-frying robots.
  • Rainbow Robotics — Spinoff from KAIST's HUBO project; Samsung Electronics first invested in 2023 and announced at the end of 2024 that it would become the largest shareholder. Developing the RB-Y1 humanoid.
  • Robotis — Dynamixel smart actuators. The de facto standard inside OpenManipulator, the Koch arm supported by LeRobot, and global robotics education kits. (The SO-100 uses Feetech servos instead.)
  • Yujin Robot — Mobile robots and educational platforms.

13.2 Japan — long-standing industrial robotics powerhouse

Japan has been world number one in industrial robotics since the 1980s. In 2026 that position has not changed.

Industrial robot Big Four (2026):

  • FANUC — Yellow industrial arms, the standard on automotive and semiconductor lines.
  • Yaskawa Motoman — Welding and assembly.
  • Kawasaki Robotics — Part of Kawasaki Heavy Industries, also known for its motorcycles.
  • Mitsubishi Electric — MELFA series.

Service and humanoid side:

  • Sony AIBO — First released 1999, revived in 2018 (ERS-1000). Icon of home robot pets.
  • SoftBank Pepper — Launched 2014, production stopped 2021. In 2026 only seen in museums.
  • Honda ASIMO — Development ended 2018. The end of the first humanoid generation.
  • Toyota T-HR3 / Human Support Robot (HSR) — Research and elder-care robots.
  • Kawasaki Kaleido — Humanoid platform first exhibited at iREX 2017.
  • Preferred Networks — AI / robotics startup, partnered with FANUC.

13.3 Different textures of the two countries

| | Korea | Japan |
|---|---|---|
| Strengths | Humanoid (BD), cobot (Doosan), motors (Robotis) | Industrial arms (FANUC, Yaskawa), precision motors, automotive |
| Weaknesses | Late entry to industrial standard market | Slow on humanoid / AI integration |
| R&D | Government-led, big-corp centered | Corporate-led, craftsman culture |
| AI integration | Fast (rapid adoption of LLM, VLA) | Conservative |

An interesting 2026 trend: the Hyundai-Boston Dynamics-Hyundai Mobis-Hyundai Wia axis has become almost the only non-US bloc that can compete head-on with US Big Tech (Tesla, Figure). Japan remains strong in industrial robotics but is in catch-up mode for the humanoid / VLA race.

13.4 Learning resources

  • Korea — KAIST and SNU robotics labs, Hyundai Robotics Academy, Woowa Brothers robot team.
  • Japan — Preferred Networks challenges, Toyota Research Institute, AIST (National Institute of Advanced Industrial Science and Technology).


Closing — robotics is really changing

In 2020, robotics development was "ROS 1 + Gazebo Classic + hand-written controllers + a PhD". The entry barrier was a PhD.

In 2026, robotics development is "ROS 2 Jazzy + Isaac Sim / MuJoCo + a VLA pulled from LeRobot + a 300 USD SO-100". The entry barrier is a laptop and curiosity.

Three forces driving the change:

  1. Simulation maturity — Isaac Sim, MuJoCo, and Gazebo Harmonic deliver industrial-grade quality for free.
  2. VLAs arriving — Perception, planning, and control once hand-engineered now compress into a single model.
  3. Democratization — LeRobot bundles data, models, and hardware in one open package.

Whether all this actually puts a humanoid in our living rooms, or whether it becomes another "self-driving in 5 years" overhype, will be answered in 2027~2028. But one thing is clear — in 2026 we wait for that answer with a tool pile that bears no resemblance to what we had a year ago.

Install ROS 2. Run one MuJoCo simulation. Follow one LeRobot ACT tutorial. That's the lightest way to put a foot into 2026 robotics.