Point Cloud
A point cloud is a collection of 3D points — each one a location (x, y, z) in space representing a position on a surface or object. Often each point also carries additional channels: color (r, g, b), return intensity, or a surface normal vector. Point clouds are the native output of LiDAR sensors and depth cameras, and the primary format for 3D spatial understanding in robotics.
Where a camera image tells you what the scene looks like, a point cloud tells you where things are in three dimensions.
How Point Clouds Are Generated
| Source | Method | Typical density |
|---|---|---|
| LiDAR | Laser time-of-flight | Sparse (64–128 beams), long range (100m+) |
| RGB-D camera | Structured light or time-of-flight | Dense (~300K points/frame), short range (5m) |
| Stereo camera | Disparity from two images | Dense, no active illumination, outdoor use |
| Photogrammetry | Multiple RGB images + Structure from Motion | Dense, offline, no depth sensor needed |
LiDAR produces sparse but long-range and highly accurate measurements, making it the sensor of choice for outdoor navigation. RGB-D cameras produce dense but shorter-range clouds, better suited for indoor manipulation where you need fine geometry near the robot.
Point Cloud Data Structure
Each point is a small record of floating-point and integer fields:
- x, y, z (float32) — 3D position in meters
- intensity (float32) — LiDAR return strength (optional)
- r, g, b (uint8) — color from RGB-D (optional)
- nx, ny, nz (float32) — surface normal vector (optional)
Stored as: PCL PointCloud&lt;PointXYZI&gt;, ROS sensor_msgs/PointCloud2, or .pcd files.
The sensor_msgs/PointCloud2 ROS message is the standard interchange format: it stores the binary point data alongside a field schema describing which channels are present and their types.
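The per-point record above can be sketched as a NumPy structured array. This is an illustration, not a real PointCloud2 parser: the field names mirror PCL's PointXYZI layout, and the values are made up.

```python
import numpy as np

# Illustrative dtype mirroring PCL's PointXYZI record (x, y, z, intensity).
point_dtype = np.dtype([
    ("x", np.float32), ("y", np.float32), ("z", np.float32),
    ("intensity", np.float32),
])

# Three example points: positions in meters, arbitrary return intensities.
cloud = np.array([
    (1.0, 0.5, 0.2, 0.9),
    (1.1, 0.5, 0.2, 0.7),
    (4.2, -2.0, 0.1, 0.3),
], dtype=point_dtype)

# Channels are accessed by field name, e.g. all z coordinates:
heights = cloud["z"]
# Most geometry code wants a plain (N, 3) position matrix:
xyz = np.stack([cloud["x"], cloud["y"], cloud["z"]], axis=1)
```

A PointCloud2 message carries essentially this: the raw bytes of such records plus the dtype-like schema, so a reader can reconstruct whichever optional channels the sensor produced.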
Key Operations
Point clouds are rarely used raw. A standard processing pipeline applies several operations before the data is useful:
- Downsampling (voxel grid): divide space into a 3D grid, keep one point per voxel. Reduces cloud density by 10–100x for faster downstream processing, typically 10–50ms on CPU.
- Filtering: remove statistical outliers (single noisy returns far from neighbors), crop to a region of interest (e.g. 5m radius around robot), or pass-through filter by height to remove the floor.
- Normal estimation: compute a surface normal vector at each point by fitting a plane to nearby neighbors. Surface normals are required for grasp pose planning — they tell the gripper which direction to approach.
- Registration (ICP — Iterative Closest Point): align two point clouds by minimizing point-to-point distance. Used in LiDAR SLAM to register each new scan against the accumulated map.
- Object segmentation: separate foreground objects from the dominant background plane (e.g. table surface) using RANSAC plane fitting followed by Euclidean cluster extraction.
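The voxel-grid downsampling step above is simple enough to sketch directly. This is a minimal NumPy version that keeps the centroid of each occupied voxel (production code would use PCL's VoxelGrid filter or Open3D instead):

```python
import numpy as np

def voxel_downsample(points, voxel_size):
    """Voxel-grid downsampling sketch: bucket points into cubic voxels
    of side voxel_size and keep one point (the centroid) per voxel."""
    # Integer voxel index for each point.
    idx = np.floor(points / voxel_size).astype(np.int64)
    # Group points that share a voxel index.
    _, inverse, counts = np.unique(idx, axis=0,
                                   return_inverse=True, return_counts=True)
    inverse = inverse.ravel()
    # Sum the points in each voxel, then divide by the count -> centroid.
    sums = np.zeros((len(counts), 3))
    np.add.at(sums, inverse, points)
    return sums / counts[:, None]

# 6 points: 5 clustered inside one 0.1 m voxel, 1 far away.
pts = np.array([
    [0.01, 0.01, 0.01], [0.02, 0.02, 0.02], [0.03, 0.01, 0.02],
    [0.04, 0.03, 0.01], [0.05, 0.05, 0.05],
    [2.00, 2.00, 2.00],
])
down = voxel_downsample(pts, voxel_size=0.1)
# -> 2 points: one centroid for the cluster, one for the lone point.
```

This is where the 10–100x reduction comes from: a dense cluster of returns on a flat surface collapses to one representative point per voxel.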
In SLAM
LiDAR SLAM algorithms (LOAM, LIO-SAM) build the environment map as a point cloud, registering each new scan against the accumulated map using ICP-variant algorithms. The output is a globally consistent 3D map that the robot can localize against.
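The registration step ICP iterates can be sketched as its closed-form core: the SVD (Kabsch) solution for the rigid transform that best aligns two clouds, assuming point correspondences are already known. Full ICP alternates this with nearest-neighbor matching; the function name here is illustrative.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Closed-form (SVD/Kabsch) rigid transform mapping src onto dst,
    given known point correspondences: one inner step of ICP."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    # Cross-covariance of the centered clouds.
    H = (src - c_src).T @ (dst - c_dst)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:      # guard against a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t

# A scan rotated 30 degrees about z and shifted; recover the transform.
rng = np.random.default_rng(0)
scan = rng.uniform(-1, 1, size=(50, 3))
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0],
                   [np.sin(theta),  np.cos(theta), 0],
                   [0, 0, 1]])
map_pts = scan @ R_true.T + np.array([0.5, -0.2, 0.1])
R, t = best_rigid_transform(scan, map_pts)
# R recovers R_true and t recovers the shift (up to numerical precision).
```

In real LiDAR SLAM the correspondences are unknown and noisy, which is why LOAM-style methods match geometric features (edges, planes) rather than raw points.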
For camera-based depth, nvblox converts RGB-D frames into a TSDF (Truncated Signed Distance Field) — a volumetric representation built by fusing point cloud data over time. The TSDF supports both mesh generation (for visualization) and occupancy grid extraction (for Nav2 path planning).
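The TSDF fusion idea can be illustrated in one dimension: each voxel stores a truncated signed distance to the nearest surface plus an observation weight, and each new depth measurement is folded in with a running weighted average. This is a simplified sketch of the update rule used by KinectFusion-style volumetric fusion; real systems (including nvblox) do this per voxel in 3D, and the constants here are arbitrary.

```python
import numpy as np

TRUNC = 0.1  # truncation distance in meters (illustrative)

voxel_depths = np.arange(0.0, 2.0, 0.05)   # voxel centers along one camera ray
tsdf = np.zeros_like(voxel_depths)          # fused signed distance per voxel
weight = np.zeros_like(voxel_depths)        # observation weight per voxel

def integrate(measured_depth):
    """Fuse one depth measurement into the TSDF with a running average."""
    global tsdf, weight
    # Signed distance from each voxel to the measured surface, truncated.
    sdf = np.clip(measured_depth - voxel_depths, -TRUNC, TRUNC)
    # Only update voxels in front of, or just behind, the surface.
    mask = measured_depth - voxel_depths > -TRUNC
    new_w = weight + mask
    tsdf = np.where(mask, (tsdf * weight + sdf) / np.maximum(new_w, 1), tsdf)
    weight = new_w

for d in (1.00, 1.02, 0.98):   # three noisy observations of a surface at ~1 m
    integrate(d)
# The zero crossing of tsdf now sits near 1.0 m: the fused surface estimate.
```

Averaging over many frames is what makes the TSDF smoother and more complete than any single noisy depth image, and the zero-level set is exactly what mesh extraction (marching cubes) and occupancy queries read out.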
In Manipulation
Point clouds are central to robotic grasping:
- Detect the target object in the point cloud (by color, shape, or learned features)
- Estimate surface normals at candidate grasp locations
- Score grasp poses based on normal alignment, contact area, and collision clearance
- Plan the gripper approach trajectory
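The scoring step above can be sketched with the normal-alignment term alone: a grasp approach is best when it opposes the surface normal head-on. The function name and scoring rule here are illustrative, not taken from any specific grasp planner, which would also weigh contact area and collision clearance.

```python
import numpy as np

def alignment_score(normal, approach_dir):
    """Score one grasp candidate by normal alignment: 1.0 when the
    approach direction is anti-parallel to the surface normal (head-on),
    0.0 when perpendicular or worse."""
    n = normal / np.linalg.norm(normal)
    a = approach_dir / np.linalg.norm(approach_dir)
    return max(0.0, float(-np.dot(n, a)))

top_normal = np.array([0.0, 0.0, 1.0])       # surface normal points up
straight_down = np.array([0.0, 0.0, -1.0])   # gripper descending onto it
sideways = np.array([1.0, 0.0, 0.0])         # gripper sliding in laterally

best = alignment_score(top_normal, straight_down)   # ideal approach
worst = alignment_score(top_normal, sideways)       # poor approach
```

This is why normal estimation is a prerequisite: without per-point normals there is no approach direction to score against.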
Libraries for point cloud manipulation:
- PCL (Point Cloud Library): C++ library with filters, segmentation, registration, and feature estimation. The standard for ROS-based robotics.
- Open3D: Python/C++ library with similar capabilities plus mesh processing and neural point cloud methods. Easier to prototype with.
Sources
- Point Cloud Library (PCL) — Reference documentation for filters, segmentation, registration, and feature estimation
- Open3D — Python and C++ point cloud and mesh processing library with tutorials
- Isaac ROS nvblox — GPU-accelerated TSDF fusion from depth camera point clouds on Jetson