AR/VR & Spatial Computing Development Complete Guide 2025: Unity, WebXR, Apple Vision Pro, Meta Quest

Author: Youngju Kim (@fjvbn20031)
The XR Era: Computing's Third Revolution
The 1984 Macintosh popularized the graphical user interface. The 2007 iPhone opened the era of multi-touch mobile computing. In 2024, Apple Vision Pro formally kicked off the "spatial computing" paradigm.
By 2026 we are in the middle of computing's third big shift. Meta Quest 3 made mixed reality mainstream at 499 USD. Apple Vision Pro paired dual 4K-class micro-OLED displays with the R1 chip to deliver ultra-low-latency passthrough. Sony PSVR2 reset the bar for console VR through its pairing with PS5. WebXR enabled full XR content delivery with nothing more than a browser.
This guide starts with terminology (AR vs VR vs MR vs XR), surveys today's hardware, then dives into the major development platforms (Unity, Unreal, WebXR, visionOS, Meta XR SDK), input modalities (controllers, hand tracking, gaze), spatial mapping, performance optimization, motion sickness mitigation, and future trends.
One thing needs stating up front. XR development is not "making VR games." It is designing interfaces that live in space. Most 2D screen rules no longer apply. New constraints appear (motion sickness, battery, thermal, field-of-view) and so do new opportunities (spatial layout, hand gestures, reality blending).
Terminology: AR vs VR vs MR vs XR
| Term | Definition | Example |
|---|---|---|
| VR (Virtual Reality) | Full immersion in a virtual world | Quest 3 VR, Beat Saber |
| AR (Augmented Reality) | Digital overlay on reality | Pokemon GO, phone AR |
| MR (Mixed Reality) | Digital and real interact | HoloLens, Quest 3 MR, Vision Pro |
| XR (Extended Reality) | Umbrella for VR+AR+MR | Industry standard |
| Spatial Computing | Space itself is the interface | Apple Vision Pro marketing |
Apple prefers "spatial computing" and sidesteps AR/VR labels because Vision Pro is passthrough-first and blurs both modes. Meta markets "mixed reality" as the flagship Quest 3 capability.
The most important developer distinction is closed vs passthrough:
- Closed VR: outside world fully occluded. Games and simulations.
- Passthrough MR: cameras show the outside world with digital overlay. Productivity, education, industrial.
Hardware Landscape: 2026 Devices
Meta Quest 3 / 3S / Pro
- Quest 3 (499 USD): 4K+ display, Snapdragon XR2 Gen 2, color passthrough, hand tracking, 8GB RAM
- Quest 3S (299 USD): Quest 3 internals with Quest 2 display, entry price
- Quest Pro: eye + face tracking, enterprise
Meta Quest leads 2026 XR market share, driven by both its price point and its content library.
Apple Vision Pro
- Vision Pro (3,499 USD): dual 4K micro-OLED, M2 + R1, 12ms motion-to-photon, EyeSight external display, gaze + pinch input
- visionOS: built on SwiftUI, RealityKit, and ARKit
Vision Pro is a developer-first premium device, not a mass-market product. Treat it as a stepping stone toward future AR glasses.
PSVR2
- PSVR2 (549 USD + PS5): OLED 4K, HDR, headset haptics, eye tracking, adaptive triggers
- Target: console gamers, AAA VR titles
HTC Vive / Pico / Varjo
- HTC Vive XR Elite: enterprise, passthrough
- Pico 4 Pro: strong in China/Asia
- Varjo XR-4: ultra-high resolution, simulation and defense
Why Specs Matter
XR hardware specs dictate your performance budget:
- Resolution: Quest 3 renders 2064x2208 per eye. At 90Hz that is 800M+ pixels per second.
- Refresh rate: 72/90/120Hz. Anything lower induces motion sickness.
- Field of view: 100-110 degrees horizontal is typical.
- GPU: Quest 3's XR2 Gen 2 is 5-10% of a desktop GPU.
- Battery/thermal: 2-3 hours, with throttling.
Key insight: XR has stricter budgets than mobile because everything is rendered twice (one eye per pass, unless using single-pass). Treat draw call and vertex budgets as halved.
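The arithmetic behind these budgets is worth internalizing. A quick sketch in plain JavaScript (the helper name and shape are illustrative; the Quest 3 panel and refresh numbers come from the list above):

```javascript
// Per-frame rendering budget from headset specs. Helper name is illustrative;
// the Quest 3 numbers are from the spec list above.
function renderBudget({ width, height, refreshHz, eyes = 2 }) {
  const pixelsPerFrame = width * height * eyes;
  return {
    pixelsPerFrame,                              // pixels shaded every frame
    pixelsPerSecond: pixelsPerFrame * refreshHz,
    frameTimeMs: 1000 / refreshHz,               // total CPU+GPU time available
  };
}

const quest3 = renderBudget({ width: 2064, height: 2208, refreshHz: 90 });
// ~9.1M pixels per frame, ~820M pixels per second, ~11.1ms per frame
```

Run the same numbers for a desktop monitor (1920x1080, one "eye", 60Hz) and the gap becomes obvious: roughly a quarter of the pixels per second, on hardware with 10-20x the GPU power.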
Unity XR Toolkit: Cross-Platform King
Unity is the de facto standard for XR in 2026. It targets Quest, Vision Pro (via PolySpatial), HTC, Pico, and PSVR2.
Setup
- Install Unity 2022 LTS or newer from Unity Hub
- Install "XR Plugin Management" via Package Manager
- Project Settings -> XR Plug-in Management -> enable Oculus or OpenXR
- Install "XR Interaction Toolkit"
XR Origin
XR Origin holds the player camera and controllers.
// Hierarchy
// XR Origin
//   Camera Offset
//     Main Camera (XR Camera)
//   Left Controller (XR Controller)
//     Direct Interactor / Ray Interactor
//   Right Controller (XR Controller)
Simple Grab Interaction
using UnityEngine;
using UnityEngine.XR.Interaction.Toolkit;

public class PickupItem : MonoBehaviour
{
    private XRGrabInteractable grabInteractable;

    void Awake()
    {
        grabInteractable = GetComponent<XRGrabInteractable>();
        grabInteractable.selectEntered.AddListener(OnGrabbed);
        grabInteractable.selectExited.AddListener(OnReleased);
    }

    void OnDestroy()
    {
        // Unsubscribe to avoid dangling listeners
        grabInteractable.selectEntered.RemoveListener(OnGrabbed);
        grabInteractable.selectExited.RemoveListener(OnReleased);
    }

    void OnGrabbed(SelectEnterEventArgs args)
    {
        Debug.Log("Grabbed");
    }

    void OnReleased(SelectExitEventArgs args)
    {
        Debug.Log("Released");
    }
}
Attach XRGrabInteractable, a Rigidbody, and a Collider. That is enough to make the object grabbable.
Locomotion
VR locomotion is the primary trigger for motion sickness. Unity offers two standard approaches.
Teleport: point, click, jump. No sickness, less immersive.
// TeleportationProvider + TeleportInteractor setup
// Add TeleportationArea to ground meshes
Continuous: joystick moves smoothly. High immersion, possible sickness.
using UnityEngine;
using UnityEngine.XR.Interaction.Toolkit;

[RequireComponent(typeof(ContinuousMoveProviderBase))]
public class ComfortSettings : MonoBehaviour
{
    private ContinuousMoveProviderBase moveProvider;

    void Awake()
    {
        moveProvider = GetComponent<ContinuousMoveProviderBase>();
        moveProvider.moveSpeed = 2.0f; // meters per second; lower speeds reduce sickness
    }
}
Recommendation: default to teleport, expose continuous as an opt-in. Offer vignette and snap-turn.
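Snap-turn is simple to implement: accumulate stick input and rotate in fixed increments, re-arming only after the stick returns to center. A minimal, engine-agnostic sketch in JavaScript (names and thresholds are illustrative):

```javascript
// Snap-turn: rotate in fixed increments instead of smoothly, avoiding the
// continuous visual motion that triggers sickness. Names are illustrative.
const SNAP_DEGREES = 45;
const STICK_THRESHOLD = 0.7; // ignore small stick deflections

// Returns the new yaw (degrees) given the current yaw and stick X axis value.
// `state.armed` prevents repeated turns while the stick is held.
function snapTurn(yaw, stickX, state) {
  if (Math.abs(stickX) < STICK_THRESHOLD) {
    state.armed = true;           // stick back near center: allow the next snap
    return yaw;
  }
  if (!state.armed) return yaw;   // still held from the previous snap
  state.armed = false;
  const turned = yaw + Math.sign(stickX) * SNAP_DEGREES;
  return ((turned % 360) + 360) % 360; // normalize to [0, 360)
}
```

The same pattern maps directly onto a Unity `Update` loop reading the thumbstick each frame.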
World Space UI
// Canvas -> Render Mode -> World Space
// Scale Canvas to 0.01 (Unity 1m = 100px)
// Add TrackedDeviceGraphicRaycaster to Canvas
// EventSystem gets XR UI Input Module
Now hands or controllers can poke buttons in world space.
Unreal Engine VR: AAA Graphics
Unreal is the choice when photorealism matters. Meta Horizon, Lone Echo, Asgard's Wrath all shipped on Unreal.
VR Template
Unreal Editor -> New Project -> Games -> Virtual Reality Template
It ships with a VR Pawn, motion controllers, teleport, and a grab system.
Motion Controller Setup
UPROPERTY(VisibleAnywhere)
UMotionControllerComponent* LeftController;

UPROPERTY(VisibleAnywhere)
UMotionControllerComponent* RightController;

// In the pawn's constructor:
LeftController = CreateDefaultSubobject<UMotionControllerComponent>(TEXT("LeftController"));
LeftController->SetupAttachment(VROrigin);
LeftController->SetTrackingSource(EControllerHand::Left);

RightController = CreateDefaultSubobject<UMotionControllerComponent>(TEXT("RightController"));
RightController->SetupAttachment(VROrigin);
RightController->SetTrackingSource(EControllerHand::Right);
Blueprint vs C++
Blueprint is unbeatable for prototyping XR logic. AAA teams typically keep performance-critical paths in C++ and game logic in Blueprint.
WebXR: XR in the Browser
WebXR is a W3C standard API that runs in Chrome, Edge, the Meta Quest Browser, and Safari on visionOS (iOS Safari support remains limited). It is the only install-free way to deliver XR content.
Three.js + WebXR
import * as THREE from 'three';
import { VRButton } from 'three/addons/webxr/VRButton.js';
import { XRControllerModelFactory } from 'three/addons/webxr/XRControllerModelFactory.js';
const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(75, window.innerWidth / window.innerHeight, 0.1, 1000);
const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setPixelRatio(window.devicePixelRatio);
renderer.setSize(window.innerWidth, window.innerHeight);
renderer.xr.enabled = true;
document.body.appendChild(renderer.domElement);
document.body.appendChild(VRButton.createButton(renderer));
const geometry = new THREE.BoxGeometry(0.2, 0.2, 0.2);
const material = new THREE.MeshStandardMaterial({ color: 0x44aa88 });
const cube = new THREE.Mesh(geometry, material);
cube.position.set(0, 1.5, -1);
scene.add(cube);
const light = new THREE.DirectionalLight(0xffffff, 1);
light.position.set(1, 1, 1);
scene.add(light);
const controller1 = renderer.xr.getController(0);
const controller2 = renderer.xr.getController(1);
scene.add(controller1);
scene.add(controller2);
const controllerModelFactory = new XRControllerModelFactory();
const controllerGrip1 = renderer.xr.getControllerGrip(0);
controllerGrip1.add(controllerModelFactory.createControllerModel(controllerGrip1));
scene.add(controllerGrip1);
renderer.setAnimationLoop(() => {
  cube.rotation.y += 0.01;
  renderer.render(scene, camera);
});
A-Frame: Declarative VR
<!DOCTYPE html>
<html>
  <head>
    <script src="https://aframe.io/releases/1.5.0/aframe.min.js"></script>
  </head>
  <body>
    <a-scene background="color: #ECECEC">
      <a-box position="-1 0.5 -3" rotation="0 45 0" color="#4CC3D9"></a-box>
      <a-sphere position="0 1.25 -5" radius="1.25" color="#EF2D5E"></a-sphere>
      <a-cylinder position="1 0.75 -3" radius="0.5" height="1.5" color="#FFC65D"></a-cylinder>
      <a-plane position="0 0 -4" rotation="-90 0 0" width="4" height="4" color="#7BC8A4"></a-plane>
      <a-camera>
        <a-cursor></a-cursor>
      </a-camera>
    </a-scene>
  </body>
</html>
A-Frame lets you author a VR scene in HTML. Minimal learning curve.
Babylon.js
Babylon.js is TypeScript-friendly with a mature toolchain. Microsoft is a major sponsor.
Immersive Web Working Group
Specs and reference samples live at https://immersive-web.github.io. Raycasting, hit-test, anchors, and plane detection are the AR-critical features.
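As a sketch of how AR hit testing fits together: the commented flow follows the WebXR Hit Test module, and the small `positionFromMatrix` helper (illustrative name) relies on `XRRigidTransform.matrix` being a column-major 4x4, with the translation in elements 12-14.

```javascript
// Extract the translation from a column-major 4x4 pose matrix, as exposed by
// XRRigidTransform.matrix (elements 12-14 hold x, y, z).
function positionFromMatrix(m) {
  return { x: m[12], y: m[13], z: m[14] };
}

// Hit-test flow inside a session (needs a real device/browser; shown for shape):
// const session = await navigator.xr.requestSession('immersive-ar', {
//   requiredFeatures: ['hit-test'],
// });
// const viewerSpace = await session.requestReferenceSpace('viewer');
// const hitTestSource = await session.requestHitTestSource({ space: viewerSpace });
// session.requestAnimationFrame((time, frame) => {
//   const hits = frame.getHitTestResults(hitTestSource);
//   if (hits.length > 0) {
//     // refSpace: an XRReferenceSpace created earlier (e.g. 'local-floor')
//     const pose = hits[0].getPose(refSpace);
//     const p = positionFromMatrix(pose.transform.matrix);
//     // place content at p
//   }
// });
```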
Apple Vision Pro: visionOS Development
visionOS Primer
Vision Pro extends iOS with new spatial primitives:
- Shared Space: multiple apps coexist
- Full Space: one app owns the environment
- Windows: 2D UI panels
- Volumes: 3D content containers
- Immersive Space: fully immersive environment
SwiftUI for Spatial
import SwiftUI
import RealityKit

@main
struct MyVisionApp: App {
    var body: some Scene {
        WindowGroup {
            ContentView()
        }
        ImmersiveSpace(id: "ImmersiveSpace") {
            ImmersiveView()
        }
    }
}

struct ContentView: View {
    @Environment(\.openImmersiveSpace) var openImmersiveSpace

    var body: some View {
        VStack {
            Text("Hello Vision Pro")
                .font(.largeTitle)
            Button("Enter Immersive") {
                Task {
                    await openImmersiveSpace(id: "ImmersiveSpace")
                }
            }
            .padding()
        }
    }
}

struct ImmersiveView: View {
    var body: some View {
        RealityView { content in
            let cube = ModelEntity(
                mesh: .generateBox(size: 0.2),
                materials: [SimpleMaterial(color: .blue, isMetallic: false)]
            )
            cube.position = [0, 1.5, -1]
            content.add(cube)
        }
    }
}
RealityKit
RealityKit is Apple's native 3D engine. It integrates with ARKit for anchors, occlusion, and physics.
import RealityKit
import ARKit
func setupARScene(content: RealityViewContent) async {
let session = ARKitSession()
let meshTracking = SceneReconstructionProvider()
do {
try await session.run([meshTracking])
for await update in meshTracking.anchorUpdates {
switch update.event {
case .added:
break
case .updated:
break
case .removed:
break
}
}
} catch {
print("ARKit session failed: \(error)")
}
}
Unity PolySpatial
Unity ships PolySpatial to target visionOS from existing Unity projects, preserving most of your Unity workflow.
Meta Quest Development: Meta XR SDK
Quest is Android under the hood: you build for the Android target in Unity with the Oculus or OpenXR plugin.
Passthrough API
using UnityEngine;
// OVRPassthroughLayer ships with the Meta XR SDK (global namespace)

public class MRDemo : MonoBehaviour
{
    public OVRPassthroughLayer passthroughLayer;

    void Start()
    {
        passthroughLayer.hidden = false;
        passthroughLayer.overlayType = OVROverlay.OverlayType.Underlay;
    }
}
Set the camera clear flags to a solid color with alpha 0, and virtual objects are composited over the real-world passthrough feed.
Scene Understanding
Quest 3 automatically identifies walls, floors, and furniture.
using UnityEngine;
using Meta.XR.MRUtilityKit;

public class RoomAnchor : MonoBehaviour
{
    void Start()
    {
        // Register the callback before kicking off scene discovery
        MRUK.Instance.RegisterSceneLoadedCallback(OnSceneLoaded);
        MRUK.Instance.LoadSceneFromDevice();
    }

    void OnSceneLoaded()
    {
        MRUKRoom room = MRUK.Instance.GetCurrentRoom();
        foreach (var anchor in room.Anchors)
        {
            Debug.Log($"Found {anchor.Label} at {anchor.transform.position}");
        }
    }
}
With this you can place virtual objects on real tables, or break through real walls in a game.
Input Modalities
6DoF Controllers
- Position plus rotation (6 degrees of freedom)
- Haptics, buttons, thumbsticks
- Meta Touch Plus, PSVR2 Sense, Valve Index
Hand Tracking
Camera-based tracking without controllers.
using UnityEngine;
using UnityEngine.XR.Hands;

public class HandTrackingExample : MonoBehaviour
{
    private XRHandSubsystem handSubsystem;

    void Start()
    {
        var subsystems = new System.Collections.Generic.List<XRHandSubsystem>();
        SubsystemManager.GetSubsystems(subsystems);
        if (subsystems.Count > 0) handSubsystem = subsystems[0];
    }

    void Update()
    {
        if (handSubsystem == null || !handSubsystem.running) return;

        var leftHand = handSubsystem.leftHand;
        if (leftHand.isTracked)
        {
            var indexTip = leftHand.GetJoint(XRHandJointID.IndexTip);
            if (indexTip.TryGetPose(out var pose))
            {
                Debug.Log($"Left index tip at {pose.position}");
            }
        }
    }
}
Gestures: pinch (thumb + index), fist, point. The Meta XR All-in-One SDK provides gesture detection APIs.
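Under the hood, pinch detection is just joint geometry: the gesture fires when the thumb tip and index tip come within a small distance of each other. A minimal sketch in JavaScript, assuming you already have joint positions in meters (threshold and names are illustrative):

```javascript
// Pinch detection from joint positions: thumb tip and index tip closer than a
// small threshold. Positions are {x, y, z} in meters; ~2cm is a common cutoff.
const PINCH_THRESHOLD_M = 0.02;

function distance(a, b) {
  const dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
  return Math.sqrt(dx * dx + dy * dy + dz * dz);
}

function isPinching(thumbTip, indexTip) {
  return distance(thumbTip, indexTip) < PINCH_THRESHOLD_M;
}
```

Production gesture detectors add hysteresis (a larger release threshold than the trigger threshold) and a few frames of debouncing so tracking jitter does not toggle the gesture.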
Eye Tracking
Vision Pro, Quest Pro, and PSVR2 ship with eye tracking. It serves two purposes:
- Input: look and pinch is Vision Pro's primary interaction model.
- Optimization: foveated rendering.
Vision Pro's UX is impossible without eye tracking. You select by looking at something and pinching your fingers.
Voice
Quest has Voice SDK, Vision Pro has Siri. Keyboard input is hard in XR, so voice is critical.
Spatial Mapping (SLAM)
SLAM (Simultaneous Localization and Mapping) reconstructs the environment from cameras and IMU. XR inside-out tracking is built on SLAM.
Plane Detection
using UnityEngine;
using UnityEngine.XR.ARFoundation;
using UnityEngine.XR.ARSubsystems;

public class PlaneDetector : MonoBehaviour
{
    public ARPlaneManager planeManager;

    void Start()
    {
        planeManager.planesChanged += OnPlanesChanged;
    }

    void OnPlanesChanged(ARPlanesChangedEventArgs args)
    {
        foreach (var plane in args.added)
        {
            Debug.Log($"New plane: {plane.alignment} at {plane.center}");
        }
    }
}
Mesh Generation
Quest 3 and Vision Pro build live 3D meshes of the room. This enables occlusion (hiding virtual objects behind real ones) and physics collisions.
Anchors
Anchors pin virtual objects to physical locations. They can persist across sessions and be shared between users.
Rendering Optimization
Foveated Rendering
Foveated rendering uses eye tracking to render only the foveal region, where the user is actually looking, at full resolution; the periphery renders at reduced resolution. Vision Pro, Quest Pro, and PSVR2 support it, and it saves 30-50% of GPU work.
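A back-of-envelope model shows where the 30-50% figure comes from. Assuming an illustrative foveal fraction and periphery pixel cost (JavaScript, hypothetical helper):

```javascript
// Back-of-envelope model of foveated rendering savings. A fraction of the
// frame renders at full resolution; the rest renders at a reduced pixel cost
// (0.5 = half the pixels). Both fractions are illustrative, not device specs.
function foveatedSavings(fovealFraction, peripheryPixelScale) {
  const full = fovealFraction;                                  // full-cost region
  const periphery = (1 - fovealFraction) * peripheryPixelScale; // cheaper region
  return 1 - (full + periphery); // fraction of pixel work saved
}

// 25% of the frame at full resolution, periphery at half the pixels:
// saves 37.5% of pixel shading work, inside the 30-50% range quoted above.
const saved = foveatedSavings(0.25, 0.5);
```

Pushing the foveal region smaller or the periphery cheaper raises the savings, which is why eye-tracked foveation beats fixed foveation: it can shrink the full-resolution region without the user noticing.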
Fixed Foveated Rendering (FFR)
Without eye tracking, the center of the display stays full resolution and peripheral areas render lower. Quest 2/3 rely on FFR.
Single-Pass Stereo Rendering
Instead of rendering the left and right eyes separately, single-pass rendering issues one draw call for both eyes. It halves the draw call count.
In Unity:
Project Settings -> XR Plug-in Management -> Oculus -> Stereo Rendering Mode -> Multiview
Draw Call Batching
Unity automatically batches objects sharing the same material. Leverage Static Batching and GPU Instancing aggressively.
Polygon Budgets
Quest 3 recommendations:
- Polygons per frame: 500K-1M
- Draw calls: 100-200
- Textures: 1024x1024 max, ASTC compression
- Shaders: Unlit or Mobile Lit
Motion Sickness Prevention
VR sickness is the biggest challenge in XR. The root cause is vestibular mismatch: your eyes see motion while your inner ear senses none.
Mitigation Techniques
- Maintain 90Hz or higher. Even a single dropped frame is noticeable.
- Prefer teleport over continuous movement.
- Vignette during motion (darken peripheral vision).
- Snap-turn (45 degree increments).
- Visible virtual nose for grounding.
- Short sessions with 20-minute breaks.
- Show the user's hands - visible avatars increase comfort.
Frame Rate Is Survival
90Hz means 11ms per frame. Use Unity Profiler and OVR Metrics Tool continuously to catch drops.
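The same check can be automated from frame timestamps in any telemetry you collect. A minimal sketch (JavaScript; names and tolerance are illustrative):

```javascript
// Count frames that blew the budget, given per-frame timestamps in ms.
// At 90Hz the budget is ~11.1ms; allow a little jitter before flagging.
function droppedFrames(timestampsMs, refreshHz, toleranceMs = 1) {
  const budget = 1000 / refreshHz;
  let dropped = 0;
  for (let i = 1; i < timestampsMs.length; i++) {
    if (timestampsMs[i] - timestampsMs[i - 1] > budget + toleranceMs) dropped++;
  }
  return dropped;
}

// Frame 3 took 23ms (a full missed vsync), so one drop is reported.
const drops = droppedFrames([0, 11, 22, 45, 56], 90);
```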
Haptic Feedback
Controller Haptics
using UnityEngine;
using UnityEngine.XR.Interaction.Toolkit;

public class HapticsExample : MonoBehaviour
{
    public XRBaseController controller;

    public void Vibrate()
    {
        // amplitude 0-1, duration in seconds
        controller.SendHapticImpulse(0.5f, 0.2f);
    }
}
Haptic Patterns
- Tap: short, strong pulse (button press)
- Rumble: sustained vibration (collision)
- Texture: frequency variation (surface feel)
Meta Haptics Studio supports authoring haptic waveforms.
Spatial Audio
Sound should appear to come from correct 3D directions. Unity supports Oculus Audio SDK, Resonance Audio, Steam Audio.
// AudioSource Spatial Blend 1.0 (3D)
// Audio Listener attached to Main Camera
// Enable HRTF for directional cues
Ambisonics is the format for 360-degree sound, used in cinematic VR and live events.
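The core math behind spatialization is easy to sketch: equal-power panning from the source's azimuth plus distance attenuation. Real spatializers use HRTFs for true 3D cues; this toy model (JavaScript, illustrative names) only conveys the idea:

```javascript
// Toy spatialization: equal-power stereo pan from azimuth plus
// inverse-distance attenuation. Real engines use HRTFs; this is the concept.
function spatialGains(azimuthRad, distanceM, refDistanceM = 1) {
  // Map azimuth (-PI/2 = hard left .. +PI/2 = hard right) to a 0..PI/2 pan angle
  const pan = (azimuthRad + Math.PI / 2) / 2;
  // Clamp inside the reference distance so gain never exceeds 1
  const att = refDistanceM / Math.max(distanceM, refDistanceM);
  return { left: Math.cos(pan) * att, right: Math.sin(pan) * att };
}

// A source dead ahead feeds both ears equally; one hard left is left-only.
const front = spatialGains(0, 1);
const hardLeft = spatialGains(-Math.PI / 2, 1);
```

Equal-power panning (cos/sin rather than linear crossfade) keeps perceived loudness constant as a source sweeps across the stereo field.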
Use Cases
- Gaming: Beat Saber, Half-Life Alyx, Horizon Worlds
- Training: medical surgery, military, welding
- Education: virtual anatomy, historical simulation
- Healthcare: PTSD therapy, pain management, rehabilitation
- Productivity: Vision Pro Mac Virtual Display, Horizon Workrooms
- Social: VRChat, Rec Room
- Industrial: BIM review, remote collaboration, digital twins
Future Trends (2026-2030)
- AR glasses: Meta Orion, Snap Spectacles, rumored Apple glasses
- Neural interfaces: Neuralink, Meta EMG wristbands
- Higher resolution: 8K per eye by 2027
- Lighter weight: sub-100g wearables
- AI-native content: LLM-driven NPCs, procedural worlds
- Digital twins: live 1:1 factory mirrors
Quiz
Q1. What is Foveated Rendering and why does it matter?
Foveated rendering uses eye tracking so only the area the user fixates on is rendered at full resolution; the periphery is rendered lower. Human vision resolves detail only in the fovea, so users do not notice the reduced peripheral quality. It saves 30-50% of GPU work, which is enormous for XR's tight budgets.
Q2. Teleport vs continuous locomotion - which is safer?
Teleport is much safer. VR sickness stems from visual motion with no vestibular motion. Teleport eliminates the mismatch entirely. Continuous movement is more immersive but triggers sickness in a significant fraction of users. The standard recommendation is to default to teleport and offer continuous as a toggle.
Q3. What is the core difference between Quest 3 and Vision Pro?
Price is the headline (499 vs 3499 USD), but the deeper split is input. Quest 3 is controller-first with hand tracking as backup. Vision Pro has no controllers; gaze + pinch is the primary input. Vision Pro also has much higher resolution and better passthrough. Quest is a gaming device; Vision Pro is a productivity device.
Q4. How does Single-Pass Stereo Rendering improve performance?
Traditional VR renders each eye separately, doubling draw calls and CPU work. Single-pass renders both eyes in one draw call, roughly halving CPU overhead. Unity exposes it as Multiview or Single Pass Instanced, and CPU-bound XR apps see significant gains.
Q5. WebXR trade-offs?
Pros: no install, cross-platform (Quest, Vision Pro, desktop VR), shareable by URL, low entry barrier. Cons: slower than native (complex shaders struggle), limited access to advanced features (hand tracking, anchors), weaker audio and haptics. Great for marketing, education, and light experiences; native is still required for AAA.
References
- Unity XR Documentation
- Unreal Engine VR
- WebXR Device API
- Three.js WebXR Examples
- A-Frame Documentation
- Apple visionOS Developer
- Meta Quest Developer
- OpenXR Specification
- ARKit Documentation
- Unity Meta XR SDK
- PSVR2 Developer
- Khronos - OpenXR Resources
- VR Best Practices - Oculus
- Foveated Rendering Research - NVIDIA