Real2Sim2Real Tactile Policy Learning
A tactile-only dexterous grasping policy for a sensorized LEAP Hand: calibrated contact simulation, geometry-aware tactile representation, and diffusion-based policy distillation.
Zero-shot sim-to-real transfer of a tactile-conditioned control policy for reactive blind grasping. The deployed policy performs blind object interaction—searching, adjusting contacts, and lifting—without any visual input, using only robot state and sparse tactile feedback.
Blind grasping with a dexterous hand is a crucial manipulation capability. Nevertheless, learning such tactile-only policies for real robots remains challenging due to the tactile sim-to-real gap and the limited expressiveness of sparse tactile signals.
To bridge this gap, we propose a framework for tactile-only blind grasping that is deployable on a physical multi-fingered robotic hand. Our approach combines three key components. First, we introduce a Real2Sim tactile calibration pipeline that constructs a contact-calibrated digital-twin simulator capable of reproducing real tactile signals. Second, we improve the expressiveness of sparse tactile observations using a layout-aware tactile encoder, which incorporates sensor-geometry priors through pretraining with privileged geometric supervision. Third, to improve generalization to unseen objects, we train object-specific reinforcement-learning experts in the calibrated simulator and aggregate their successful grasp trajectories into a tactile-conditioned Diffusion Policy.
We evaluate our method on a physical LEAP Hand equipped with distributed tactile sensing across 10 seen and 10 unseen objects. The deployed policy achieves a 51.8% real-world grasp success rate across all 20 objects (60.4% on seen and 43.2% on unseen objects), without real-world grasping demonstrations or visual input.
Real2Sim tactile calibration. Aligns binary contact-event timing between simulation and hardware using lightweight, task-agnostic controlled-contact motions, without requiring complex soft-body modeling or real-world grasping demonstrations.
Layout-aware tactile encoder. Grounds each sensor in 3D via hand kinematics and is pretrained with privileged geometric supervision (object pose, contact labels), injecting spatial priors into sparse binary tactile observations.
Diffusion Policy distillation. Decouples exploration from deployment: object-specific RL experts generate diverse contact-rich trajectories in simulation, which are aggregated to train a single tactile-conditioned Diffusion Policy capable of multimodal grasp generation under partial observability.
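The aggregation step can be illustrated with a minimal sketch. It assumes each expert rollout carries a success flag and that only successful rollouts contribute training pairs, as the abstract describes; the `Trajectory` type, the `aggregate_successes` helper, and the toy data are hypothetical and not taken from the released code.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# One step is an (observation, action) pair; plain float lists in this sketch.
Step = Tuple[List[float], List[float]]


@dataclass
class Trajectory:
    """Hypothetical container for one simulated expert rollout."""
    object_name: str
    steps: List[Step]
    success: bool  # assumed flag: did the simulated grasp lift the object?


def aggregate_successes(rollouts: Dict[str, List[Trajectory]]) -> List[Step]:
    """Pool (observation, action) pairs from the successful rollouts of all experts."""
    dataset: List[Step] = []
    for trajectories in rollouts.values():
        for trajectory in trajectories:
            if trajectory.success:
                dataset.extend(trajectory.steps)
    return dataset


# Toy rollouts from two object-specific experts (values are placeholders).
rollouts = {
    "Cube": [
        Trajectory("Cube", [([0.0], [0.1]), ([1.0], [0.2])], success=True),
        Trajectory("Cube", [([0.0], [0.3])], success=False),
    ],
    "Ball": [Trajectory("Ball", [([1.0], [0.4])], success=True)],
}
dataset = aggregate_successes(rollouts)
print(len(dataset))  # number of pooled training pairs
```

The pooled pairs would then serve as the supervised dataset for the single tactile-conditioned policy; how the actual pipeline weights or balances objects is not specified here.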
Supplementary Video
A compact reel of physical trials shows the policy searching for contact, adjusting finger placement, and lifting objects without visual input.
Our Real2Sim2Real framework consists of three stages: calibrated tactile simulation, layout-aware tactile encoding, and Diffusion Policy distillation.
Real2Sim2Real framework overview. Calibrated tactile simulation provides contact-rich training data; a layout-aware tactile encoder distills geometric priors; a Diffusion Policy enables multimodal grasp generation for real-world deployment.
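As a rough illustration of contact-event timing calibration, the sketch below picks a simulated contact threshold so that binary contact onsets in simulation match those recorded on hardware. This is only one plausible formulation: the page does not state which simulator parameters are tuned, so the single force threshold, the grid search, and all function names and toy values are assumptions.

```python
from typing import List, Sequence


def onset_time(signal: Sequence[int], dt: float) -> float:
    """Time of the first sample where the binary contact signal is 1.

    If the sensor never fires, the end of the recording is returned as a penalty.
    """
    for i, value in enumerate(signal):
        if value:
            return i * dt
    return len(signal) * dt


def simulate_binary(forces: Sequence[float], threshold: float) -> List[int]:
    """Assumed sensor model: contact fires when simulated force reaches the threshold."""
    return [1 if force >= threshold else 0 for force in forces]


def timing_error(real_signals, sim_forces, threshold: float, dt: float) -> float:
    """Mean absolute difference between real and simulated contact onsets."""
    errors = [
        abs(onset_time(real, dt) - onset_time(simulate_binary(forces, threshold), dt))
        for real, forces in zip(real_signals, sim_forces)
    ]
    return sum(errors) / len(errors)


def calibrate_threshold(real_signals, sim_forces, candidates, dt: float) -> float:
    """Grid search for the threshold with the smallest onset-timing error."""
    return min(candidates, key=lambda th: timing_error(real_signals, sim_forces, th, dt))


# Toy data: two controlled-contact motions sampled at 100 Hz (placeholder values).
dt = 0.01
sim_forces = [[0.0, 0.1, 0.2, 0.3, 0.4, 0.5], [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]]
real_signals = [[0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1]]
best = calibrate_threshold(real_signals, sim_forces, [0.05, 0.15, 0.25, 0.35, 0.45], dt)
print(best)
```

Because only onset timing of binary events is compared, a procedure of this kind needs no soft-body model of the sensor, which is consistent with the lightweight calibration described above.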
Our physical platform consists of a 6-DoF xArm6 manipulator equipped with a 16-DoF LEAP Hand. The hand is instrumented with distributed tactile sensing: four custom-built TwinTac sensors on the fingertips (together providing 32 binary channels) and 12 FSR channels on the palm and finger surfaces, yielding 44 tactile channels in total.
Hardware platform and tactile sensing. The LEAP Hand is instrumented with heterogeneous tactile sensors—four TwinTac fingertip sensors and distributed FSRs across palm and phalanges.
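To make the idea of grounding each sensor in 3D concrete, the following sketch attaches a position to every binary reading by transforming the sensor's mounting offset with the pose of the link it sits on. It assumes link poses are already available from forward kinematics; the link names, offsets, and the `[x, y, z, contact]` token layout are hypothetical and do not describe the actual encoder input.

```python
from typing import Dict, List, Sequence, Tuple

Vec3 = Sequence[float]
Mat3 = Sequence[Sequence[float]]


def sensor_world_position(rotation: Mat3, translation: Vec3, local_offset: Vec3) -> List[float]:
    """Apply p_world = R * p_local + t for one sensor mounted on a hand link."""
    return [
        sum(rotation[r][c] * local_offset[c] for c in range(3)) + translation[r]
        for r in range(3)
    ]


def tactile_tokens(
    link_poses: Dict[str, Tuple[Mat3, Vec3]],
    sensor_mounts: Sequence[Tuple[str, Vec3]],
    contacts: Sequence[int],
) -> List[List[float]]:
    """Pair each binary reading with the sensor's current 3D position."""
    tokens = []
    for (link_id, offset), contact in zip(sensor_mounts, contacts):
        rotation, translation = link_poses[link_id]
        x, y, z = sensor_world_position(rotation, translation, offset)
        tokens.append([x, y, z, float(contact)])
    return tokens


# Toy example: a fingertip link rotated 90 degrees about z, and an unrotated palm.
link_poses = {
    "fingertip": ([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]], [0.1, 0.0, 0.05]),
    "palm": ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], [0.0, 0.0, 0.0]),
}
sensor_mounts = [("fingertip", [0.01, 0.0, 0.0]), ("palm", [0.0, 0.02, 0.0])]
tokens = tactile_tokens(link_poses, sensor_mounts, contacts=[1, 0])
print(tokens)
```

With positions attached, two identical binary patterns observed at different finger configurations produce different inputs, which is the kind of spatial prior the layout-aware encoder is meant to exploit.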
Open-source hardware. To facilitate reproducibility, we release the full hardware design of our tactile sensing suite.
We evaluate on 20 diverse objects (10 seen, 10 unseen) with the policy deployed zero-shot on the physical LEAP Hand—no real-world grasping demonstrations, no visual input, only proprioception and sparse tactile feedback.
Experiment overview. (A) Tactile sensor layout. (B) Physical xArm6–LEAP Hand platform. (C) 20-object benchmark with 10 seen and 10 unseen instances.
Experiment Results
Success rates for all 20 objects across 50 trials each (all reported rates are multiples of 2%), comparing policies with and without privileged tactile pretraining.
10 trained object instances
| Object | w/o Pretrain | w/ Pretrain | Improvement |
|---|---|---|---|
| Cube | 60% | 62% | +2% |
| Ball | 8% | 56% | +48% |
| Box | 30% | 48% | +18% |
| Cross | 52% | 66% | +14% |
| CubeBall | 38% | 100% | +62% |
| Egg | 16% | 24% | +8% |
| Big Egg | 0% | 8% | +8% |
| H-shape | 44% | 76% | +32% |
| Hollow Cube | 70% | 98% | +28% |
| Hourglass | 44% | 66% | +22% |
| Seen Avg. | 36.2% | 60.4% | +24.2% |
10 held-out object instances
| Object | w/o Pretrain | w/ Pretrain | Improvement |
|---|---|---|---|
| Hash-Shape | 22% | 42% | +20% |
| C-Shape | 18% | 34% | +16% |
| E-Shape | 18% | 38% | +20% |
| T-Shape | 28% | 56% | +28% |
| Cylinder | 2% | 14% | +12% |
| Fork | 20% | 62% | +42% |
| Ring | 50% | 80% | +30% |
| Snowman | 22% | 48% | +26% |
| Tetrahedral | 8% | 28% | +20% |
| Triple | 12% | 30% | +18% |
| Unseen Avg. | 20.0% | 43.2% | +23.2% |
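As a consistency check on the two tables, the short script below recomputes the per-split averages and the pooled rate over all 20 objects. The percentages are copied from the tables; the variable names are illustrative only.

```python
# Per-object success rates in percent: (without pretraining, with pretraining).
seen = {
    "Cube": (60, 62), "Ball": (8, 56), "Box": (30, 48), "Cross": (52, 66),
    "CubeBall": (38, 100), "Egg": (16, 24), "Big Egg": (0, 8),
    "H-shape": (44, 76), "Hollow Cube": (70, 98), "Hourglass": (44, 66),
}
unseen = {
    "Hash-Shape": (22, 42), "C-Shape": (18, 34), "E-Shape": (18, 38),
    "T-Shape": (28, 56), "Cylinder": (2, 14), "Fork": (20, 62),
    "Ring": (50, 80), "Snowman": (22, 48), "Tetrahedral": (8, 28),
    "Triple": (12, 30),
}


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)


def split_average(table, column):
    """Average success rate of one column (0 = without, 1 = with pretraining)."""
    return mean(rates[column] for rates in table.values())


all_objects = {**seen, **unseen}
for name, table in [("Seen", seen), ("Unseen", unseen), ("All 20", all_objects)]:
    without = split_average(table, 0)
    with_pretrain = split_average(table, 1)
    print(f"{name}: {without:.1f}% -> {with_pretrain:.1f}% "
          f"(+{with_pretrain - without:.1f} points)")
```

With the table values, the pooled success rate over all 20 objects is 51.8% with pretraining and 28.1% without, since both splits contain the same number of objects and trials.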
Qualitative Results
Each card shows full-width trial clips for the same object, preserving the native wide aspect ratio of the qualitative videos.
Several limitations remain in the current system. First, the overall success rate is 51.8%, which is encouraging for a challenging tactile-only setting but modest in absolute terms. Second, the hardware provides incomplete tactile coverage: contacts in unsensed regions of the hand produce ambiguous observations that can confuse the policy. Third, the fixed execution horizon does not always suffice for the policy to establish stable grasp configurations, sometimes leading to empty grasps or object drops during lifting.
This paper presents a complete pipeline for learning and deploying a blind grasping policy on a robotic dexterous hand. Our pipeline combines a tactile-enabled dexterous hand platform, a high-fidelity tactile simulation environment calibrated via Real2Sim, and an RL expert policy with a distillation framework for Sim2Real deployment. By jointly leveraging contact-event calibration, geometry-aware tactile representation learning, and diffusion-based policy aggregation, our system achieves promising results on a large and diverse real-world object set. Future work will focus on improving grasp robustness through denser tactile skins, explicit slip detection, and two-timescale reactive control architectures.