Real2Sim2Real Tactile Policy Learning

Blind Dexterous Grasping via Real2Sim2Real Tactile Policy Learning

A tactile-only dexterous grasping policy for a sensorized LEAP Hand: calibrated contact simulation, geometry-aware tactile representation, and diffusion-based policy distillation.

44 tactile channels
20 real objects
0 visual input

Zero-shot sim-to-real transfer of a tactile-conditioned control policy for reactive blind grasping. The deployed policy performs blind object interaction—searching, adjusting contacts, and lifting—without any visual input, using only robot state and sparse tactile feedback.

Abstract

Blind grasping with a dexterous hand is a crucial manipulation capability. Nevertheless, learning such tactile-only policies for real robots remains challenging due to the tactile sim-to-real gap and the limited expressiveness of sparse tactile signals.

To bridge this gap, we propose a framework for tactile-only blind grasping that is deployable on a physical multi-fingered robotic hand. Our approach combines three key components. First, we introduce a Real2Sim tactile calibration pipeline that constructs a contact-calibrated digital-twin simulator capable of reproducing real tactile signals. Second, we improve the expressiveness of sparse tactile observations using a layout-aware tactile encoder, which incorporates sensor-geometry priors through self-supervised pretraining. Third, to improve generalization to unseen objects, we train object-specific reinforcement-learning experts in the calibrated simulator and aggregate their successful grasp trajectories into a tactile-conditioned Diffusion Policy.

We evaluate our method on a physical LEAP Hand equipped with distributed tactile sensing across 10 seen and 10 unseen objects. The deployed policy achieves a 27% real-world grasp success rate across all 20 objects, without real-world grasping demonstrations or visual input.

Key Ideas

Real2Sim tactile calibration

Aligns binary contact-event timing between simulation and hardware using lightweight task-agnostic controlled-contact motions, without requiring complex soft-body modeling or real-world grasping demonstrations.
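One way to picture contact-event timing alignment is as a one-parameter fit: choose the simulated force threshold at which a binary contact fires so that the simulated onset step matches the onset recorded on hardware for the same controlled motion. The sketch below is illustrative only. The function names, the grid search, and the toy force trace are assumptions, not the calibration procedure used in this work.

```python
def onset_index(binary_signal):
    """Return the index of the first nonzero entry, or None if there is none."""
    for i, value in enumerate(binary_signal):
        if value:
            return i
    return None


def calibrate_threshold(sim_forces, real_contacts, candidates):
    """Pick the force threshold whose simulated onset is closest to the real onset.

    sim_forces    : simulated contact force per time step (hypothetical trace)
    real_contacts : binary contact readings from hardware for the same motion
    candidates    : force thresholds to try (plain grid search, an assumption)
    """
    real_onset = onset_index(real_contacts)
    if real_onset is None:
        raise ValueError("real trace contains no contact event")

    best_threshold, best_error = None, float("inf")
    for threshold in candidates:
        sim_contacts = [1 if f >= threshold else 0 for f in sim_forces]
        sim_onset = onset_index(sim_contacts)
        if sim_onset is None:
            continue  # this threshold never fires in simulation
        error = abs(sim_onset - real_onset)
        if error < best_error:
            best_threshold, best_error = threshold, error
    return best_threshold, best_error


# Toy data: force ramps up as a finger presses; the hardware sensor fires at step 4.
sim_forces = [0.0, 0.1, 0.3, 0.6, 1.0, 1.5, 2.1]
real_contacts = [0, 0, 0, 0, 1, 1, 1]
threshold, error = calibrate_threshold(sim_forces, real_contacts, [0.2, 0.5, 1.0, 2.0])
```

On this toy trace the candidate 1.0 reproduces the hardware onset step exactly, so it is selected with zero timing error.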

Layout-aware tactile encoder

Grounds each sensor in 3D via hand kinematics and is pretrained with privileged geometric supervision (object pose, contact labels), injecting spatial priors into sparse binary tactile observations.
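As a rough illustration of grounding sensors in 3D, each channel can be turned into a token that pairs the sensor's position in the hand base frame with its binary reading. The sketch below assumes link poses are already available from forward kinematics as 4x4 homogeneous transforms. The token format, the function names, and the toy pose are hypothetical and are not taken from the released model.

```python
def transform_point(pose, point):
    """Apply a 4x4 homogeneous transform (nested lists, row-major) to a 3D point."""
    x, y, z = point
    return tuple(
        pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
        for r in range(3)
    )


def build_tactile_tokens(link_poses, sensor_layout, contacts):
    """Pair each sensor's 3D position (hand base frame) with its binary reading.

    link_poses    : {link_name: 4x4 pose from forward kinematics}
    sensor_layout : list of (link_name, offset_in_link_frame), one per channel
    contacts      : list of 0/1 readings, in the same order as sensor_layout
    """
    if len(sensor_layout) != len(contacts):
        raise ValueError("layout and readings must have one entry per channel")
    tokens = []
    for (link, offset), contact in zip(sensor_layout, contacts):
        position = transform_point(link_poses[link], offset)
        tokens.append(position + (float(contact),))  # (x, y, z, contact)
    return tokens


# Toy pose: fingertip link rotated 90 degrees about z and shifted 0.1 m along x.
fingertip_pose = [
    [0.0, -1.0, 0.0, 0.1],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
layout = [("fingertip", (0.02, 0.0, 0.0)), ("fingertip", (0.0, 0.01, 0.0))]
tokens = build_tactile_tokens({"fingertip": fingertip_pose}, layout, [1, 0])
```

Because the positions are recomputed from the current joint configuration, the same binary reading carries different spatial meaning as the fingers move, which is the kind of prior a layout-aware encoder can exploit.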

Expert-to-diffusion distillation

Decouples exploration from deployment: object-specific RL experts generate diverse contact-rich trajectories in simulation, which are aggregated to train a single tactile-conditioned Diffusion Policy capable of multimodal grasp generation under partial observability.
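The aggregation step can be sketched as a filter-and-merge over per-object expert rollouts: failed rollouts are discarded, and each successful one is sliced into observation and action-chunk pairs for training a single policy. The rollout dictionary layout, the chunk horizon, and the function name below are assumptions made for illustration; they do not describe the actual data format of this work.

```python
def aggregate_successful_rollouts(rollouts_by_object, horizon):
    """Collect (observation, action-chunk) pairs from successful expert rollouts.

    rollouts_by_object : {object_name: [rollout, ...]}, where each rollout is a
                         dict with keys "success" (bool), "obs", and "actions"
    horizon            : number of consecutive actions in each training target
    """
    dataset = []
    for object_name, rollouts in rollouts_by_object.items():
        for rollout in rollouts:
            if not rollout["success"]:
                continue  # keep only successful grasps
            obs, actions = rollout["obs"], rollout["actions"]
            for t in range(len(actions) - horizon + 1):
                dataset.append({
                    "object": object_name,
                    "obs": obs[t],
                    "action_chunk": actions[t:t + horizon],
                })
    return dataset


# Toy rollouts from two hypothetical object-specific experts.
rollouts = {
    "cube": [
        {"success": True, "obs": ["o0", "o1", "o2", "o3"], "actions": [0, 1, 2, 3]},
        {"success": False, "obs": ["o0", "o1"], "actions": [9, 9]},
    ],
    "ball": [
        {"success": True, "obs": ["o0", "o1", "o2"], "actions": [4, 5, 6]},
    ],
}
dataset = aggregate_successful_rollouts(rollouts, horizon=2)
```

With a horizon of 2, the successful cube rollout contributes three samples and the ball rollout two, while the failed cube rollout contributes none.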

Supplementary Video

Blind grasping in motion

A compact reel of physical trials shows the policy searching for contact, adjusting finger placement, and lifting objects without visual input.

Policy Architecture

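Our Real2Sim2Real framework consists of three stages: Real2Sim tactile calibration, layout-aware tactile encoder pretraining, and expert-to-diffusion policy distillation.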

Overview of the proposed Real2Sim2Real framework for blind dexterous grasping

Real2Sim2Real framework overview. Calibrated tactile simulation provides contact-rich training data; a layout-aware tactile encoder distills geometric priors; a Diffusion Policy enables multimodal grasp generation for real-world deployment.

Hardware

Our physical platform consists of a 6-DoF xArm6 manipulator equipped with a 16-DoF LEAP Hand. The hand is instrumented with distributed tactile sensing: four custom-built TwinTac sensors on the fingertips (providing 4 × 8 = 32 binary channels) and 12 FSR channels on the palm and finger surfaces, yielding 44 tactile channels in total.
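The channel count can be made concrete with a small sketch that packs the readings into one 44-dimensional vector. The channel ordering (fingertips first, then FSRs), the function name, and the assumption that FSR readings have already been thresholded to 0/1 are illustrative choices, not details confirmed by the hardware description.

```python
NUM_FINGERTIP_SENSORS = 4    # TwinTac sensors, one per fingertip
CHANNELS_PER_FINGERTIP = 8   # binary channels per TwinTac sensor
NUM_FSR_CHANNELS = 12        # FSR channels on palm and finger surfaces


def assemble_tactile_vector(fingertip_readings, fsr_readings):
    """Concatenate fingertip and FSR readings into one flat tactile vector.

    fingertip_readings : 4 lists of 8 binary values each
    fsr_readings       : 12 values, assumed already thresholded to 0/1
    """
    if len(fingertip_readings) != NUM_FINGERTIP_SENSORS:
        raise ValueError("expected one reading list per fingertip sensor")
    if any(len(r) != CHANNELS_PER_FINGERTIP for r in fingertip_readings):
        raise ValueError("each fingertip sensor must provide 8 channels")
    if len(fsr_readings) != NUM_FSR_CHANNELS:
        raise ValueError("expected 12 FSR channels")

    vector = [value for sensor in fingertip_readings for value in sensor]
    vector.extend(fsr_readings)
    return vector


# Toy frame: only the first channel of the first fingertip is in contact.
fingertips = [[0] * CHANNELS_PER_FINGERTIP for _ in range(NUM_FINGERTIP_SENSORS)]
fingertips[0][0] = 1
tactile = assemble_tactile_vector(fingertips, [0] * NUM_FSR_CHANNELS)
```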

Hardware platform and tactile sensing layout

Hardware platform and tactile sensing. The LEAP Hand is instrumented with heterogeneous tactile sensors—four TwinTac fingertip sensors and distributed FSRs across palm and phalanges.

Open-source hardware. To facilitate reproducibility, we release the full hardware design of our tactile sensing suite:

Open-source hardware repository (coming soon)

Experiment Setup

We evaluate on 20 diverse objects (10 seen, 10 unseen) with the policy deployed zero-shot on the physical LEAP Hand—no real-world grasping demonstrations, no visual input, only proprioception and sparse tactile feedback.

Experiment setup with object sets and evaluation protocol

Experiment overview. (A) Tactile sensor layout. (B) Physical xArm6–LEAP Hand platform. (C) 20-object benchmark with 10 seen and 10 unseen instances.

Experiment Results

Per-object grasp success rates

Success rates for all 20 objects across 5 trials each, comparing policies with and without privileged tactile pretraining.

Seen avg. w/ pretrain: 60.4%
Unseen avg. w/ pretrain: 43.2%
Average improvement: +23.7 percentage points

Seen Objects

10 trained object instances

Object        w/o Pretrain   w/ Pretrain   Improve.
Cube          60%            62%           +2%
Ball          8%             56%           +48%
Box           30%            48%           +18%
Cross         52%            66%           +14%
CubeBall      38%            100%          +62%
Egg           16%            24%           +8%
Big Egg       0%             8%            +8%
H-shape       44%            76%           +32%
Hollow Cube   70%            98%           +28%
Hourglass     44%            66%           +22%
Seen Avg.     36.2%          60.4%         +24.2%

Unseen Objects

10 held-out object instances

Object        w/o Pretrain   w/ Pretrain   Improve.
Hash-Shape    22%            42%           +20%
C-Shape       18%            34%           +16%
E-Shape       18%            38%           +20%
T-Shape       28%            56%           +28%
Cylinder      2%             14%           +12%
Fork          20%            62%           +42%
Ring          50%            80%           +30%
Snowman       22%            48%           +26%
Tetrahedral   8%             28%           +20%
Triple        12%            30%           +18%
Unseen Avg.   20.0%          43.2%         +23.2%
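The summary averages can be re-derived from the per-object rows. The short script below copies the table values verbatim and recomputes the column means and the mean gain; the variable and function names are only for illustration.

```python
# (w/o pretrain, w/ pretrain) success rates in percent, copied from the tables.
seen = {
    "Cube": (60, 62), "Ball": (8, 56), "Box": (30, 48), "Cross": (52, 66),
    "CubeBall": (38, 100), "Egg": (16, 24), "Big Egg": (0, 8),
    "H-shape": (44, 76), "Hollow Cube": (70, 98), "Hourglass": (44, 66),
}
unseen = {
    "Hash-Shape": (22, 42), "C-Shape": (18, 34), "E-Shape": (18, 38),
    "T-Shape": (28, 56), "Cylinder": (2, 14), "Fork": (20, 62),
    "Ring": (50, 80), "Snowman": (22, 48), "Tetrahedral": (8, 28),
    "Triple": (12, 30),
}


def column_means(table):
    """Return (mean without pretraining, mean with pretraining) in percent."""
    n = len(table)
    without = sum(w for w, _ in table.values()) / n
    with_pretrain = sum(p for _, p in table.values()) / n
    return without, with_pretrain


seen_without, seen_with = column_means(seen)        # 36.2, 60.4
unseen_without, unseen_with = column_means(unseen)  # 20.0, 43.2

# Gains are differences of percentages, i.e. percentage points.
seen_gain = seen_with - seen_without
unseen_gain = unseen_with - unseen_without
average_gain = (seen_gain + unseen_gain) / 2

print(round(seen_gain, 1), round(unseen_gain, 1), round(average_gain, 1))
# prints: 24.2 23.2 23.7
```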

Limitations

Several limitations remain in the current system. First, the overall success rate is 27%—encouraging for a challenging tactile-only setting, but modest in absolute terms. Second, the hardware provides incomplete tactile coverage: contacts in unsensed regions of the hand produce ambiguous observations that can confuse the policy. Third, the fixed execution horizon does not always suffice for the policy to establish stable grasp configurations, sometimes leading to empty grasps or object drops during lifting.

Conclusion

This paper presents a complete pipeline for learning and deploying a blind grasping policy on a robotic dexterous hand. Our pipeline combines a tactile-enabled dexterous hand platform, a tactile simulation environment calibrated via Real2Sim, and object-specific RL experts distilled into a single policy for Sim2Real deployment. By jointly leveraging contact-event calibration, geometry-aware tactile representation learning, and diffusion-based policy aggregation, our system achieves promising results on a diverse set of 20 real-world objects, half of them unseen during training. Future work will focus on improving grasp robustness through denser tactile skins, explicit slip detection, and two-timescale reactive control architectures.