Depth Cameras

Depth cameras measure the distance from the sensor to every point in the scene, producing a 2.5D representation of the environment. They are essential for obstacle avoidance, manipulation, 3D mapping, and any task requiring spatial understanding beyond what 2D images provide.

Three Depth Sensing Technologies

Stereo Vision

Uses two cameras separated by a known baseline to calculate depth via triangulation.

[Left Cam]───baseline───[Right Cam]
     │                       │
     └───────────┬───────────┘
                 │ Disparity Matching
                 ▼
            [Depth Map]

How it works: The same point appears at different horizontal positions in each image. This difference (disparity) is inversely proportional to depth.
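Depth follows directly from disparity as Z = f·B/d, where f is the focal length in pixels, B the baseline, and d the disparity in pixels; structured light (below) relies on the same triangulation, with the projector standing in for one camera. A minimal numeric sketch, using illustrative values rather than any real camera's calibration:

# Stereo depth from disparity: Z = f * B / d
# Focal length, baseline, and disparity here are illustrative
# assumptions, not values from a specific camera.
focal_px = 640.0       # focal length in pixels
baseline_m = 0.095     # baseline in meters (95 mm)
disparity_px = 12.0    # measured horizontal disparity in pixels

depth_m = focal_px * baseline_m / disparity_px
print(f"Depth: {depth_m:.2f} m")   # ~5.07 m; larger disparity means closer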

Pros: Works outdoors, long range, passive operation possible
Cons: Needs texture for matching, computationally intensive

Structured Light

Projects a known IR pattern and analyzes its distortion on objects.

[IR Projector]───pattern───►[Object]
                                │
                            distortion
                                │
     [Camera]◄──────────────────┘
         │ triangulation
         ▼
    [Depth Map]

How it works: Known patterns (dots, stripes, or coded patterns) deform when hitting surfaces at different depths. A camera captures this distortion and computes depth via triangulation.

Pros: High accuracy at short range, works on textureless surfaces
Cons: IR pattern washed out in sunlight, limited range

Time-of-Flight (ToF)

Measures the time for IR light to travel to an object and back.

[Emitter]────IR pulse────►[Object]
    │                        │
    │        reflection      │
    │◄───────────────────────┘
    ▼ measure time/phase
[Depth Map]

How it works: Emits modulated IR light and measures either phase shift (continuous wave) or direct time delay (pulsed) to calculate distance.
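For the continuous-wave case, distance comes from the phase shift via d = c·Δφ / (4π·f_mod); the 4π (rather than 2π) accounts for the round trip. A minimal sketch, with an assumed 20 MHz modulation frequency:

import math

# Continuous-wave ToF: d = c * delta_phi / (4 * pi * f_mod)
c = 299_792_458.0         # speed of light, m/s
f_mod = 20e6              # modulation frequency, Hz (assumed value)
delta_phi = math.pi / 2   # measured phase shift, radians

distance_m = c * delta_phi / (4 * math.pi * f_mod)
print(f"Distance: {distance_m:.3f} m")   # ~1.874 m

# Phase wraps at 2*pi, so the unambiguous range is
# c / (2 * f_mod), about 7.5 m at 20 MHz: one reason ToF
# range is limited compared to stereo.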

Pros: Fast capture, works in low light, texture-independent
Cons: Multipath interference, limited outdoor range

Technology Comparison

Aspect           Stereo       Structured Light     ToF
Range            0.5–20 m     0.3–5 m              0.2–8 m
Accuracy         Medium       High (short range)   Medium
Speed            Medium       Medium               Fast
Outdoor          Good         Poor                 Moderate
Texture needed   Yes          No                   No
Cost             Low–Medium   Medium               Medium–High

Intel RealSense (Active Stereo)

Model   Baseline   Range       Features
D455    95 mm      0.4–6 m     86° FoV, up to 90 fps
D456    95 mm      0.4–6 m     IP65-rated for outdoor use
D405               0.1–0.5 m   Short-range manipulation
D435i   50 mm      0.3–3 m     Built-in IMU, compact

Stereolabs ZED (Stereo + Neural Depth)

Model         Features
ZED 2i        IP66, 12 cm baseline, 0.2–20 m, neural depth
ZED SDK 5.1   Isaac ROS NITROS support, 10x lower latency

Orbbec

Model          Technology      Features
Femto Bolt     ToF             Azure Kinect replacement, 120° FoV
Femto Mega     ToF             Built-in compute, PoE support
Gemini 335Lg   Active Stereo   Optimized for mobile robots, indoor/outdoor

ROS 2 Integration

Standard Topics

/camera/depth/image_raw          # Depth image (32FC1 meters or 16UC1 mm)
/camera/depth/camera_info        # Calibration parameters
/camera/depth_registered/points  # Colored point cloud
/camera/color/image_raw          # RGB image
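Depth arrives in one of two encodings: 32FC1 (float meters, invalid pixels 0 or NaN) or 16UC1 (unsigned millimeters, invalid pixels 0). A minimal sketch for normalizing both to float meters, assuming the message has already been converted to a NumPy array via cv_bridge:

import numpy as np

def depth_to_meters(depth, encoding):
    # 16UC1: unsigned 16-bit, one unit = 1 mm; 0 marks invalid pixels
    if encoding == '16UC1':
        return depth.astype(np.float32) / 1000.0
    # 32FC1: already float meters; invalid pixels are 0 or NaN
    if encoding == '32FC1':
        return depth
    raise ValueError(f'Unexpected depth encoding: {encoding}')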

Launch Examples

# Intel RealSense with point cloud
ros2 launch realsense2_camera rs_launch.py \
    enable_depth:=true \
    depth_module.profile:=640x480x30 \
    pointcloud.enable:=true

# ZED camera
ros2 launch zed_wrapper zed_camera.launch.py \
    camera_model:=zed2i

# Orbbec Femto Bolt
ros2 launch orbbec_camera femto_bolt.launch.py
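Before wiring up subscribers, it is worth confirming that depth frames are actually flowing; these commands assume the default /camera namespace used above:

# Publication rate of the depth stream
ros2 topic hz /camera/depth/image_raw

# Encoding, resolution, and frame_id of a single message
# (--no-arr suppresses the raw pixel array)
ros2 topic echo /camera/depth/image_raw --once --no-arr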

Depth Subscriber Example

import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from cv_bridge import CvBridge


class DepthSubscriber(Node):
    def __init__(self):
        super().__init__('depth_subscriber')
        self.subscription = self.create_subscription(
            Image, '/camera/depth/image_raw',
            self.depth_callback, 10)
        self.bridge = CvBridge()

    def depth_callback(self, msg):
        # Convert to a numpy array (meters, 32-bit float)
        depth_image = self.bridge.imgmsg_to_cv2(
            msg, desired_encoding='32FC1')
        # Sample the center pixel; invalid pixels read
        # as 0 or NaN depending on the driver
        h, w = depth_image.shape
        center_depth = depth_image[h // 2, w // 2]
        self.get_logger().info(f'Center depth: {center_depth:.3f}m')


def main():
    rclpy.init()
    node = DepthSubscriber()
    rclpy.spin(node)
    rclpy.shutdown()


if __name__ == '__main__':
    main()

Isaac ROS GPU Acceleration

Isaac ROS 3.2 provides GPU-accelerated depth processing:

Package                       Purpose
isaac_ros_stereo_image_proc   Disparity computation
isaac_ros_depth_image_proc    Depth to point cloud
isaac_ros_dnn_stereo_depth    DNN-based depth estimation

DNN Stereo Models

ESS (Efficient Semi-Supervised):

  • Fast inference for real-time applications
  • Outputs disparity + confidence map
  • Robust to unseen environments
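As a rough sketch of how ESS is launched (the package name, launch file, and engine_file_path argument follow the Isaac ROS documentation but should be verified against the installed release; the engine path is a placeholder):

# DNN stereo disparity with ESS: requires a TensorRT engine
# built for the target GPU (the path below is a placeholder)
ros2 launch isaac_ros_ess isaac_ros_ess.launch.py \
    engine_file_path:=/path/to/ess.engine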

FoundationStereo (new in Isaac ROS 3.2):

  • Transformer-based foundation model
  • Zero-shot generalization across scenes
  • Uses Depth Anything V2 as feature extractor
  • Best Paper Nomination at CVPR 2025

Depth Processing Pipeline

┌──────────────┐    ┌──────────────┐    ┌──────────────┐
│ Depth        │ →  │ Undistort    │ →  │ Register     │
│ Sensor       │    │ (calibrate)  │    │ to RGB       │
└──────────────┘    └──────────────┘    └──────┬───────┘
                                               │
┌──────────────┐    ┌──────────────┐    ┌──────▼───────┐
│ Obstacle     │ ←  │ Point Cloud  │ ←  │ Depth to     │
│ Detection    │    │ (PCL2)       │    │ 3D Points    │
└──────────────┘    └──────────────┘    └──────────────┘
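The "Depth to 3D Points" step is just the inverse pinhole projection, using the intrinsics published on /camera/depth/camera_info (fx, fy, cx, cy live in the k matrix of sensor_msgs/CameraInfo). A minimal per-pixel sketch; the intrinsic values shown are placeholders:

import numpy as np

def deproject(u, v, z, fx, fy, cx, cy):
    # Inverse pinhole model: pixel (u, v) at depth z becomes a
    # 3D point in the optical frame (x right, y down, z forward)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.array([x, y, z])

# Placeholder intrinsics; read the real ones from CameraInfo.k
# (row-major 3x3: fx = k[0], cx = k[2], fy = k[4], cy = k[5])
point = deproject(u=320, v=240, z=1.5,
                  fx=610.0, fy=610.0, cx=320.0, cy=240.0)
print(point)  # [0.0, 0.0, 1.5] at the principal point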

Choosing the Right Technology

Use Case                 Recommended
Outdoor navigation       Stereo (passive or active)
Indoor manipulation      Structured light or ToF
AMR obstacle avoidance   ToF (fast, compact)
3D scanning              Structured light (high accuracy)
Low-light environments   ToF
Long-range perception    Stereo
