Controlling the Vision Pro AR/VR Headset: 6 Hand Gestures for Navigating visionOS

The Vision Pro shifts the paradigm of how people navigate and interact with software in mixed reality. Instead of requiring clunky physical controllers, Apple’s AR/VR headset relies on sophisticated input mechanisms we’re all born with—our hands and eyes.

Make selections, scroll, zoom, rotate, move things around, and more just by looking at things and performing simple hand gestures reminiscent of Multi-Touch on the iPhone.


Follow along to explore the six basic hand gestures used throughout visionOS and learn how your gaze and hands make using Apple’s spatial computer seamless and effortless.

How Do Vision Pro’s Hand Gestures Work?

Hand gestures in visionOS work in tandem with eye tracking. Your eyes provide targeting and intent, and gestures trigger specific actions. The Vision Pro features front, side, and bottom-mounted cameras for hand tracking, and infrared sensors inside the headset track your eyes.


Woman sitting on a couch, wearing Apple's Vision Pro headset and performing a gesture with hands in her lap
Image Credit: Apple

You look at an onscreen item—like a button—to highlight it or hold your gaze to expand the underlying contextual menu or reveal a tooltip. With the selection made, perform a gesture like pinching to act on it. Because the external cameras cover a wide field of view, you can perform gestures while resting your hand comfortably in your lap to avoid strain.
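To make the division of labor concrete, here is a minimal Python sketch of the look-then-pinch model described above. It is a toy illustration, not Apple's implementation or any visionOS API: the `Element` and `GazeAndPinchInput` names are hypothetical, invented only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Element:
    """A hypothetical onscreen item, such as a button."""
    name: str
    on_activate: Callable[[], str]


class GazeAndPinchInput:
    """Toy model: the eyes choose the target, a pinch triggers its action."""

    def __init__(self) -> None:
        self.focused: Optional[Element] = None

    def look_at(self, element: Optional[Element]) -> None:
        # Looking at an item highlights it; looking away clears the focus.
        self.focused = element

    def pinch(self) -> Optional[str]:
        # A pinch with nothing in focus does nothing.
        if self.focused is None:
            return None
        return self.focused.on_activate()


play_button = Element("Play", lambda: "playing")
headset = GazeAndPinchInput()
headset.look_at(play_button)   # gaze selects the target
result = headset.pinch()       # pinch performs the default action
```

Note that the pinch itself carries no position in this model. The hand can rest anywhere the cameras can see it, which is why gesturing from your lap works.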

Apple has outlined six common types of hand gestures for visionOS in a WWDC23 developer talk about designing hand and eye interactions for spatial input. Coupled with eyes for targeting, these hand gestures let you control apps and interactions across the system.



Illustration showing how to perform six basic Vision Pro hand gestures
Image Credit: Apple

Tap

Pinching your thumb and index finger together performs the default action on the selected item. Think of it as the equivalent of tapping the iPhone’s screen or clicking your Mac’s mouse or trackpad. Some visionOS features, such as air typing on the virtual keyboard, don’t require the tap gesture.

Double Tap

Pinch your thumb and index finger together twice in quick succession to perform the equivalent of the double-tap gesture in touch-first operating systems like iOS and iPadOS.

Pinch and Hold

Bring your thumb and index finger together and hold them momentarily to trigger this gesture. Pinch and hold in visionOS works similarly to the long-press (tap and hold) gesture on your iPhone for highlighting text or bringing up the contextual menu.
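Tap, double tap, and pinch and hold all use the same pinch and differ mainly in timing. The Python sketch below shows one way such a distinction could be drawn. The threshold values and the `classify_pinches` function are assumptions made for illustration; Apple has not published the timings visionOS uses.

```python
from typing import List, Tuple

# Assumed values for illustration only; not taken from visionOS.
HOLD_THRESHOLD = 0.5      # seconds a pinch must last to count as a hold
DOUBLE_TAP_WINDOW = 0.3   # max seconds between two quick pinches


def classify_pinches(pinches: List[Tuple[float, float]]) -> List[str]:
    """Classify (start_time, end_time) pinches, sorted by start time."""
    gestures = []
    i = 0
    while i < len(pinches):
        start, end = pinches[i]
        if end - start >= HOLD_THRESHOLD:
            gestures.append("pinch_and_hold")
            i += 1
            continue
        # Short pinch: see whether a second short pinch follows quickly.
        if i + 1 < len(pinches):
            next_start, next_end = pinches[i + 1]
            quick_follow_up = next_start - end <= DOUBLE_TAP_WINDOW
            next_is_short = next_end - next_start < HOLD_THRESHOLD
            if quick_follow_up and next_is_short:
                gestures.append("double_tap")
                i += 2
                continue
        gestures.append("tap")
        i += 1
    return gestures
```

For example, a single pinch lasting 0.1 seconds is classified as a tap, while one lasting 0.8 seconds is classified as a pinch and hold.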

Keep your fingers pinched and move your hand to reposition windows and other UI elements in visionOS. If Safari is in front of you, you can also hold your hand in the air and flick your finger to scroll.

Pinch and Drag

The pinch and drag gesture is used for scrolling. Just bring your fingers together and flick your wrist. You can scroll up, down, left, or right by dragging your hand in any direction.

The faster you move your hand, the faster you’ll scroll. The gesture works in any visionOS window that supports scrolling, such as webpages in Safari, emails in Mail, documents in Pages, and spreadsheets in Excel.


Zoom

To zoom in Safari, Photos, and other apps, pinch the thumb and index finger on each hand, then move your hands apart. You can use this gesture to bring a window in visionOS closer or push it farther away, enlarge images, and more. In the Photos app, look at a specific part of an image while performing the zoom gesture to zoom in on that spot.


Rotate

To rotate an item, pinch your fingers on both hands while holding your hands close together, then move one hand upwards and the other downwards. You can use this gesture to rotate 3D objects in Quick Look, images in the Photos app, illustrations in documents, and so on.
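The two-handed zoom and rotate gestures can both be read from the same pair of pinch points: zoom from how the distance between the hands changes, rotation from how the line between them tilts. The Python sketch below works through that geometry in 2D. It is an illustration under simplified assumptions, not how visionOS computes these values, and the function names are hypothetical.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position of a pinched hand


def zoom_scale(start_left: Point, start_right: Point,
               now_left: Point, now_right: Point) -> float:
    """Scale factor: current hand separation divided by the starting one."""
    start_distance = math.dist(start_left, start_right)
    if start_distance == 0:
        raise ValueError("hands must start at different positions")
    return math.dist(now_left, now_right) / start_distance


def rotation_degrees(start_left: Point, start_right: Point,
                     now_left: Point, now_right: Point) -> float:
    """Signed change in the angle of the line from left hand to right hand."""
    start_angle = math.atan2(start_right[1] - start_left[1],
                             start_right[0] - start_left[0])
    now_angle = math.atan2(now_right[1] - now_left[1],
                           now_right[0] - now_left[0])
    delta = math.degrees(now_angle - start_angle)
    # Normalize to the range [-180, 180) so small turns stay small.
    return (delta + 180) % 360 - 180


# Hands move apart along a horizontal line: separation triples.
scale = zoom_scale((0, 0), (1, 0), (-1, 0), (2, 0))

# Right hand moves up and left hand moves down: a counterclockwise turn.
angle = rotation_degrees((-1, 0), (1, 0), (-1, -1), (1, 1))
```

A scale above 1 means the hands moved apart (zoom in), and a positive angle here means a counterclockwise rotation.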

Eye and Hand Tracking “Just Works” on the Vision Pro

As on the iPhone, Apple provides a basic gestural language for navigating the entire visionOS user interface and augmented reality apps. Third-party developers can use Apple’s official visionOS APIs to add gestural input to their apps.

Apple doesn’t recommend creating custom gestures that clash with the system ones or those that aren’t easy to understand, perform, or memorize.

Eye and hand tracking in virtual, augmented, and mixed reality applications are not new features. However, the Vision Pro headset implements these features in Apple’s trademark manner. As a result, navigating visionOS with your eyes and hands is seamless and effortless.

