Bimanual manipulation video: what makes the data useful

Two-hand tasks expose coordination, contact, object state, and occlusion challenges that single-action datasets rarely capture well.

Embodied AI Data Labs 7 min read
Bimanual tasks are coordination problems

Folding fabric, tying knots, opening packaging, assembling parts, and preparing food all require the two hands to take different roles over time. One hand may stabilize an object while the other manipulates it, and the roles may then switch.

Datasets should preserve these transitions instead of reducing the task to one broad activity label.
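One way to keep those transitions visible is to store hand roles as a timeline rather than a single clip-level label. The sketch below is illustrative only: it assumes per-frame role labels already exist, and the role names ("stabilize", "manipulate", "idle") and the `role_segments` helper are hypothetical, not part of any particular dataset format.

```python
from itertools import groupby

# Hypothetical per-frame role labels as (left_hand, right_hand) pairs.
frames = [
    ("stabilize", "manipulate"),
    ("stabilize", "manipulate"),
    ("stabilize", "manipulate"),
    ("manipulate", "stabilize"),
    ("manipulate", "stabilize"),
    ("idle", "manipulate"),
]


def role_segments(frames):
    """Collapse runs of identical role pairs into segments.

    Returns (start_frame, end_frame, left_role, right_role) tuples,
    where end_frame is exclusive.
    """
    segments = []
    start = 0
    for roles, run in groupby(frames):
        length = len(list(run))
        segments.append((start, start + length, roles[0], roles[1]))
        start += length
    return segments


segments = role_segments(frames)
# Every boundary between consecutive segments is a role transition.
transition_count = len(segments) - 1
```

Here the clip keeps three segments and two transitions, including the switch at frame 3 where the hands trade roles. A single "folding" label would discard both.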

Capture detail without losing context

A first-person view can reveal finger placement and contact, while a fixed view preserves the complete workspace and object geometry. Multi-view capture is especially useful when one hand frequently occludes the other.

Task scripts should state which variations are acceptable while leaving execution natural, so that models see realistic coordination.

Annotate the signals a policy can use

Useful labels may include left- and right-hand roles, contact start and end times, object state, task phase, success status, and quality flags. These fields support both retrieval and targeted evaluation.
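As a rough illustration, the fields above could be grouped into one record per clip and then filtered for retrieval. This is a minimal sketch under stated assumptions: the class names, field names, label values, and the `select_clips` helper are all hypothetical, chosen only to mirror the labels listed here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContactEvent:
    hand: str        # "left" or "right"
    object_id: str   # hypothetical object identifier
    start_s: float   # contact start, seconds from clip start
    end_s: float     # contact end, seconds from clip start


@dataclass
class ClipAnnotation:
    clip_id: str
    task_phase: str
    left_role: str
    right_role: str
    object_state: str
    success: Optional[bool]  # None means not yet judged
    contacts: List[ContactEvent] = field(default_factory=list)
    quality_flags: List[str] = field(default_factory=list)


def select_clips(annotations, left_role=None, right_role=None,
                 success=None, exclude_flags=()):
    """Return annotations matching every filter that was supplied."""
    selected = []
    for ann in annotations:
        if left_role is not None and ann.left_role != left_role:
            continue
        if right_role is not None and ann.right_role != right_role:
            continue
        if success is not None and ann.success is not success:
            continue
        if any(flag in exclude_flags for flag in ann.quality_flags):
            continue
        selected.append(ann)
    return selected


# Made-up example records, for illustration only.
clips = [
    ClipAnnotation("clip_001", "fold", "stabilize", "manipulate",
                   "half_folded", True,
                   contacts=[ContactEvent("left", "towel", 0.4, 3.1)]),
    ClipAnnotation("clip_002", "fold", "stabilize", "manipulate",
                   "unfolded", False),
    ClipAnnotation("clip_003", "fold", "stabilize", "manipulate",
                   "half_folded", True, quality_flags=["occluded"]),
]

clean_successes = select_clips(clips, left_role="stabilize",
                               success=True, exclude_flags=("occluded",))
```

The same filter can serve targeted evaluation, for example by selecting only failed clips or only clips flagged as occluded.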

Before scaling, teams should inspect whether the labels remain consistent across difficult tasks and different annotators.
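One common way to quantify consistency between two annotators is Cohen's kappa, which corrects raw agreement for the agreement expected by chance. The article does not prescribe a metric, so treat this as one possible check; the phase labels and annotator data below are made up for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equal in length")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical task-phase labels from two annotators on the same six segments.
annotator_a = ["reach", "grasp", "grasp", "fold", "fold", "release"]
annotator_b = ["reach", "grasp", "fold", "fold", "fold", "release"]

kappa = cohen_kappa(annotator_a, annotator_b)
```

In this example the annotators disagree on one of six segments and kappa comes out at about 0.77. Computing the score separately for each task makes it easier to see where labeling guidelines break down on the difficult cases.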
