COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping

Abstract

This paper addresses the challenge of occluded robot grasping, i.e. grasping in situations where the desired grasp poses are kinematically infeasible due to environmental constraints such as surface collisions. Traditional robot manipulation approaches struggle with the complexity of non-prehensile or bimanual strategies commonly used by humans in these circumstances. State-of-the-art reinforcement learning (RL) methods are unsuitable due to the inherent complexity of the task. In contrast, learning from demonstration requires collecting a significant number of expert demonstrations, which is often infeasible. Instead, inspired by human bimanual manipulation strategies, where two hands coordinate to stabilise and reorient objects, we focus on a bimanual robotic setup to tackle this challenge. In particular, we introduce Constraint-based Manipulation for Bimanual Occluded Grasping (COMBO-Grasp), a learning-based approach which leverages two coordinated policies: a constraint policy trained using self-supervised datasets to generate stabilising poses and a grasping policy trained using RL that reorients and grasps the target object. A key contribution lies in value function-guided policy coordination. Specifically, during RL training for the grasping policy, the constraint policy's output is refined through gradients from a jointly trained value function, improving bimanual coordination and task performance. Lastly, COMBO-Grasp employs teacher-student policy distillation to effectively deploy point cloud-based policies in real-world environments. Empirical evaluations demonstrate that COMBO-Grasp significantly improves task success rates compared to competitive baseline approaches, with successful generalisation to unseen objects in both simulated and real-world environments.

Approach

COMBO-Grasp is a bimanual robotic system designed for grasping objects in scenarios where the desired grasp pose is occluded by environmental constraints such as a table surface. The system leverages two coordinated policies: a constraint policy that stabilizes the object using one arm and a grasping policy that reorients and grasps the object using the other arm. By integrating value function-guided policy coordination and self-supervised learning, COMBO-Grasp efficiently learns robust bimanual grasping strategies, achieving superior performance in both simulated and real-world environments.

(1) Self-supervised Teacher Constraint Policy Training

A constraint policy generates stabilising poses for one arm, enabling effective bimanual occluded grasping. The teacher constraint policy is trained in simulation using a novel self-supervised data collection method that leverages force-closure principles to ensure object stability. By collecting diverse constraint poses across multiple objects, the trained policy accelerates the learning of the grasping policy and improves overall task performance.
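To make the force-closure idea concrete, the following is a minimal sketch of how candidate constraint contacts could be filtered in a self-supervised way. It is not the data collection procedure used in COMBO-Grasp, whose details are not given here. It assumes a simplified planar setting with two point contacts and Coulomb friction, and uses the standard antipodal condition: the segment joining the two contacts must lie inside both friction cones. All function names, the friction coefficient `mu`, and the example geometry are hypothetical.

```python
import math

def in_friction_cone(direction, normal, mu):
    """Return True if `direction` lies inside the friction cone around `normal`."""
    dx, dy = direction
    nx, ny = normal
    d_norm = math.hypot(dx, dy)
    n_norm = math.hypot(nx, ny)
    if d_norm == 0.0 or n_norm == 0.0:
        return False
    cos_angle = (dx * nx + dy * ny) / (d_norm * n_norm)
    # The cone half-angle is atan(mu); compare cosines (small tolerance for rounding).
    return cos_angle >= math.cos(math.atan(mu)) - 1e-12

def two_contact_force_closure(p1, n1, p2, n2, mu):
    """Planar antipodal test; n1 and n2 are inward-pointing contact normals."""
    d12 = (p2[0] - p1[0], p2[1] - p1[1])
    d21 = (-d12[0], -d12[1])
    return in_friction_cone(d12, n1, mu) and in_friction_cone(d21, n2, mu)

def filter_constraint_candidates(grasp_contact, candidates, mu=0.5):
    """Keep candidate (point, normal) contacts that stabilise against the grasp contact."""
    p_g, n_g = grasp_contact
    return [
        (p, n) for p, n in candidates
        if two_contact_force_closure(p_g, n_g, p, n, mu)
    ]

# Hypothetical example: a unit-square object pushed from its left face.
grasp_contact = ((0.0, 0.5), (1.0, 0.0))
candidates = [
    ((1.0, 0.5), (-1.0, 0.0)),  # opposite face, directly across
    ((0.5, 1.0), (0.0, -1.0)),  # top face, 45 degrees off the grasp normal
]
stable = filter_constraint_candidates(grasp_contact, candidates, mu=0.5)
```

In this example only the contact on the opposite face passes the test, so it is the only one that would be kept as a training label in such a scheme. A 3D version would need a full wrench-space analysis.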

(2) Teacher Policy Training with Constraint Refinement

The teacher constraint policy, initially trained in a self-supervised manner, generates stabilizing poses, but these may not be optimal for grasp execution. To improve coordination, a value function-guided refinement is introduced: during RL training of the grasping policy, gradients from the grasping policy’s value function adjust the constraint policy’s output. This improves bimanual coordination, resulting in better task performance.
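The refinement step can be pictured as a few steps of gradient ascent on the value function with respect to the constraint pose. The sketch below illustrates that idea only. It assumes a toy quadratic value function and finite-difference gradients in place of a learned critic with automatic differentiation, and the step size, number of steps, and all names are hypothetical rather than taken from COMBO-Grasp.

```python
def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference gradient of scalar function f at point x."""
    grad = []
    for i in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2.0 * eps))
    return grad

def refine_constraint_pose(pose, value_fn, step_size=0.1, num_steps=5):
    """Move the proposed constraint pose uphill on the value function."""
    refined = list(pose)
    for _ in range(num_steps):
        grad = numerical_gradient(value_fn, refined)
        refined = [p + step_size * g for p, g in zip(refined, grad)]
    return refined

def toy_value(pose):
    """Hypothetical stand-in for a learned value function: peaks at `best_pose`."""
    best_pose = [0.3, -0.1, 0.2]
    return -sum((p - b) ** 2 for p, b in zip(pose, best_pose))

initial_pose = [0.0, 0.0, 0.0]  # e.g. the constraint policy's raw output
refined_pose = refine_constraint_pose(initial_pose, toy_value)
```

With a learned critic, the gradient would typically come from backpropagation through the value network rather than finite differences, and the value function would also depend on the current state.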

(3) Teacher-Student Distillation

The teacher policies, trained with privileged information, are distilled into student policies that process point clouds and proprioceptive states, enabling robust sim-to-real transfer and generalization to unseen objects.
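As a rough illustration of teacher-student distillation, the sketch below trains a student that sees only a transformed observation to imitate a teacher that acts on the privileged state. It is a toy under strong assumptions: the state is a scalar, the student is linear, the observation model is a made-up affine map rather than a point cloud encoder, and training is plain supervised regression on teacher actions. None of the names or hyperparameters come from COMBO-Grasp.

```python
import random

def teacher_policy(privileged_state):
    """Hypothetical teacher: acts directly on the privileged simulator state."""
    return 2.0 * privileged_state - 1.0

def observe(privileged_state):
    """Hypothetical sensor model: the student sees a scaled, offset view of the state."""
    return 0.5 * privileged_state + 0.1

def distill(num_epochs=200, learning_rate=0.1, seed=0):
    """Fit a linear student (w * obs + b) to teacher actions with per-sample SGD."""
    rng = random.Random(seed)
    states = [rng.uniform(-1.0, 1.0) for _ in range(64)]
    w, b = 0.0, 0.0
    for _ in range(num_epochs):
        for state in states:
            obs = observe(state)
            target = teacher_policy(state)  # label supplied by the teacher
            error = (w * obs + b) - target
            # Gradient step on the squared imitation error.
            w -= learning_rate * error * obs
            b -= learning_rate * error
    return w, b

w, b = distill()

def student_policy(obs):
    """Distilled student: needs only the observation, not the privileged state."""
    return w * obs + b
```

After training, `student_policy(observe(s))` closely tracks `teacher_policy(s)` on states not seen during distillation, which is the property that lets a student run without privileged information at deployment.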

Results

Simulated Experiments

Real-world Results

Box-Medium-Heavy (Seen)

Box-Large-Light

Box-Small-Heavy

Keyboard

Bag

Round Box

BibTeX

@article{yamada2024combograsp,
  author    = {Jun Yamada and Alexander L. Mitchell and Jack Collins and Ingmar Posner},
  title     = {COMBO-Grasp: Learning Constraint-Based Manipulation for Bimanual Occluded Grasping},
  journal   = {arXiv preprint arXiv:2502.08054},
  year      = {2025},
}