Datasets

Real-world robot datasets used in VLA research

Skill = atomic action primitive (pick, place, reach)  ·  Task = instruction-level goal  ·  Modality: RGB = color images · D = depth · L = language · F = force/torque · A = audio · T = touch

Dataset         Episodes  Skills    Tasks    Modality  Embodiment     Collection
QT-Opt          580K      1 (Pick)  -        RGB       KUKA LBR iiwa  Learned
MT-Opt          800K      2         12       RGBL      7 robots       Scripted, Learned
RoboNet         162K      -         -        RGB       7 robots       Scripted
BridgeData      7.2K      4         71       RGBL      WidowX 250     Teleop
BridgeData V2   60.1K     13        -        RGB-DL    WidowX 250     Teleop
BC-Z            26.0K     3         100      RGBL      Google EDR     Teleop
Language Table  413K      1 (Push)  -        RGBL      xArm           Teleop
RH20T           110K      42        147      RGB-DLFA  4 robots       Teleop
RT-1            130K      12        700+     RGBL      Google EDR     Teleop
OXE             1.4M      527       160,266  RGB-DL    22 robots      Mixed
DROID           76K       86        -        RGB-DL    Franka Panda   Teleop
FuSe            27K       2         3        RGBLTA    WidowX 250     Teleop
RoboMIND        107K      38        479      RGB-DL    4 robots       Teleop
AgiBot World    1M        87        217      RGB-DL    AgiBot G1      Teleop

(- = no value listed)

Adapted from the survey (Table 1). Statistics are as reported in the original papers.
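
The compact modality codes in the table (for example RGB-DLFA) follow the legend above and can be expanded mechanically. The snippet below is a minimal sketch of that decoding, not part of the survey or of any dataset's tooling: the names MODALITY_LEGEND and decode_modality are hypothetical, and it assumes that a hyphen is only a visual separator and that every code is built from "RGB" plus the single letters in the legend.

```python
# Hypothetical helper (not from the survey): expands the table's modality codes.
# Assumption: "-" is only a visual separator, and "RGB" is the one multi-letter token.
MODALITY_LEGEND = {
    "RGB": "RGB",
    "D": "depth",
    "L": "language",
    "F": "force/torque",
    "A": "audio",
    "T": "touch",
}


def decode_modality(code: str) -> list[str]:
    """Expand a code such as 'RGB-DLFA' into a list of modality names."""
    compact = code.replace("-", "")
    names = []
    i = 0
    while i < len(compact):
        if compact.startswith("RGB", i):
            # Consume the three-letter RGB token as a unit.
            names.append(MODALITY_LEGEND["RGB"])
            i += 3
        else:
            letter = compact[i]
            if letter not in MODALITY_LEGEND:
                raise ValueError(f"unknown modality letter {letter!r} in {code!r}")
            names.append(MODALITY_LEGEND[letter])
            i += 1
    return names


if __name__ == "__main__":
    for code in ("RGB", "RGBL", "RGB-DL", "RGB-DLFA", "RGBLTA"):
        print(code, "->", ", ".join(decode_modality(code)))
```

Raising an error on an unknown letter, rather than skipping it, keeps a mistyped code from silently dropping a modality.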