Volume 1

The Metric Backbone

Mathematical Foundations of Geometric Localization and Rigid Body Motion

Master the invisible geometry that powers the future of spatial computing.

Strategic Objectives

• Master the core linear algebra behind 3D rigid body transformations.

• Implement robust point cloud registration using state-of-the-art algorithms.

• Understand the probabilistic frameworks of EKF and Graph-based SLAM.

• Build the metric foundations required for seamless digital-physical integration.

The Core Challenge

Modern AR/VR and robotics fail without a precise, mathematically sound understanding of spatial coordinates and motion estimation.

01

The SLAM Paradigm

Simultaneous Localization and Mapping Explained
You will begin your journey by understanding the fundamental duality of SLAM: why you cannot map without a location and cannot locate without a map. This chapter establishes the high-level architecture that governs the rest of your technical deep dive.
The SLAM Duality
Understanding the Interdependence of Localization and Mapping

Explores the conceptual relationship between localization and mapping, emphasizing why both must be solved simultaneously and how errors in one propagate to the other.

Historical Context and Motivations
From Early Robotics to Modern SLAM

Provides an overview of the evolution of SLAM, highlighting key milestones, early challenges in robotic navigation, and the motivations for simultaneous estimation.

Core SLAM Architecture
High-Level Frameworks and System Components

Introduces the primary structural elements of a SLAM system, including sensors, state estimation, map representation, and feedback loops.

02

Foundations of Euclidean Space

Coordinates and Vector Calculus
Before you can track motion, you must master the stage where it occurs. You will revisit Euclidean space to ensure your mathematical intuition for 3D distances and coordinate systems is bulletproof for geometric modeling.
The Stage of Geometry
From Intuition to Formal Euclidean Structure

This section reframes three-dimensional space as a mathematical object rather than a visual intuition. It develops Euclidean space as a structured set equipped with distance, inner product, and linear structure, clarifying the assumptions that make geometric localization possible. Emphasis is placed on why flatness, parallelism, and rigid invariance form the backbone of classical motion modeling.

Coordinate Systems as Measurement Devices
Cartesian Frames and the Algebra of Position

Coordinates are introduced not as mere labels but as measurement instruments imposed on space. The section develops Cartesian coordinates, basis vectors, and the algebraic representation of points and displacements. It distinguishes between geometric objects and their coordinate descriptions, preparing the reader for transformations and frame changes.

Vectors as Directed Displacements
Linear Structure and the Geometry of Addition

This section formalizes vectors as elements of a linear space underlying Euclidean geometry. Vector addition, scalar multiplication, and linear combinations are connected directly to physical displacement and rigid body motion. The geometric meaning of span, linear independence, and dimensionality is interpreted through spatial reasoning rather than abstract symbolism.

03

Rigid Body Dynamics

Defining Motion Without Deformation
You need to treat objects and sensors as undeformable entities to simplify the SLAM problem. This chapter teaches you how to abstract complex physical shapes into rigid bodies for efficient mathematical tracking.
The Rigid Body Idealization
Why Localization Begins by Ignoring Deformation

Introduces the rigid body assumption as a deliberate mathematical abstraction rather than a physical truth. Explains how treating objects and sensors as undeformable entities collapses infinite internal degrees of freedom into a finite, tractable representation. Frames rigidity as the foundational simplification that makes geometric localization and SLAM computationally feasible.

Degrees of Freedom in Three-Dimensional Space
Translational and Rotational Components of Motion

Develops the six-degree-of-freedom structure of rigid motion in three-dimensional space. Separates translation from rotation and clarifies why no additional internal coordinates are required under rigidity. Connects these freedoms directly to pose estimation in localization problems.

Frames, Coordinates, and Relative Description
Describing the Same Body from Different Observers

Establishes the necessity of coordinate frames for representing rigid bodies. Explains body-fixed frames, inertial frames, and relative transformations. Demonstrates how motion is not an intrinsic property but a relation between frames—an insight central to multi-sensor SLAM systems.

04

The Rotation Matrix

Orthonormal Bases and SO(3)
You will learn how to represent orientation mathematically using Special Orthogonal groups. Mastering rotation matrices allows you to transform perspective between the world frame and the moving sensor frame without distortion.
Orientation as a Change of Basis
From Physical Rotation to Coordinate Transformation

This section reframes rotation not as spinning objects but as transforming coordinate descriptions between frames. Beginning with the distinction between a physical rigid body and its mathematical representation, it shows how orientation is encoded as a change of orthonormal basis. The world frame and the sensor frame are introduced as alternative coordinate lenses through which the same geometric reality is described, establishing rotation matrices as the algebraic bridge between them.

Constructing the Rotation Matrix
Columns as Orthonormal Basis Vectors

Here the rotation matrix is built explicitly from geometric principles. Each column is interpreted as a unit vector of the rotated frame expressed in the original frame. The orthonormality constraints are derived from metric preservation: dot products and lengths must remain invariant. This leads naturally to the conditions R^T R = I and det(R) = 1, grounding the matrix not in convention but in geometric necessity.
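
As a minimal sketch of these two conditions, the NumPy snippet below builds a rotation about the z-axis and checks them numerically. The helper name rotation_z and the test vector are illustrative choices, not taken from the book.

```python
import numpy as np

def rotation_z(theta):
    """Rotation by theta radians about z; each column is an axis of the
    rotated frame expressed in the original frame."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  -s,  0.0],
                     [s,   c,  0.0],
                     [0.0, 0.0, 1.0]])

R = rotation_z(np.pi / 6)

# Orthonormal columns: R^T R = I
orthonormal = np.allclose(R.T @ R, np.eye(3))

# Proper rotation (no reflection): det(R) = 1
proper = np.isclose(np.linalg.det(R), 1.0)

# Metric preservation: the length of a vector is unchanged by R
v = np.array([1.0, 2.0, 3.0])
length_preserved = np.isclose(np.linalg.norm(R @ v), np.linalg.norm(v))
```

All three flags should evaluate to True for any angle, up to floating-point tolerance.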

The Special Orthogonal Group SO(3)
Group Structure and Geometric Integrity

This section elevates rotation matrices from isolated tools to elements of a mathematical group. The defining properties of SO(3) are explored: closure under multiplication, existence of identity and inverses, and smooth manifold structure. The reader sees how composition of orientations corresponds to matrix multiplication and why inversion corresponds to transposition. The group viewpoint clarifies how successive rigid motions accumulate without distorting the metric backbone.

05

Quaternions in Spatial Tracking

Avoiding Gimbal Lock in 3D Space
You will explore the four-dimensional algebra of quaternions to solve the practical issues of gimbal lock and interpolation. This is critical for achieving smooth, continuous orientation tracking in AR/VR headsets.
From Euler Angles to Rotational Pathologies
Why Three Parameters Are Not Enough

This section revisits classical orientation representations—Euler angles and rotation matrices—and exposes their structural weaknesses in real-time tracking systems. It explains how gimbal lock emerges from coordinate singularities and why these singularities are unacceptable in continuous head tracking for AR/VR devices. The limitations of parameter redundancy, numerical instability, and interpolation artifacts are framed as practical engineering problems that demand a more robust algebraic structure.

The Four-Dimensional Algebra of Orientation
Understanding Quaternions as an Extension of Complex Numbers

This section introduces quaternions as a four-dimensional number system extending complex numbers. It develops the algebraic structure—scalar and vector parts, basis elements, and multiplication rules—and interprets quaternion multiplication as composition of rotations. The geometric meaning of unit quaternions is established, connecting four-dimensional algebra to three-dimensional spatial rotation.

Unit Quaternions and the Geometry of SO(3)
Mapping the 3-Sphere to Physical Rotations

Here the chapter connects unit quaternions to the topology of three-dimensional rotation space. It explains how the 3-sphere double-covers the rotation group and why antipodal quaternions represent the same physical orientation. The section emphasizes continuity, smooth parameterization, and the elimination of singularities, positioning quaternions as a metric-consistent representation within rigid body motion.
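
The double cover can be checked numerically. The sketch below assumes the scalar-first convention q = (w, x, y, z), which is one common choice and may differ from the book's. The function name quat_to_rot is hypothetical.

```python
import numpy as np

def quat_to_rot(q):
    """Rotation matrix for a quaternion q = (w, x, y, z), scalar first.
    The input is normalized, so any nonzero quaternion is accepted."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

# A 90-degree rotation about z: q = (cos 45deg, 0, 0, sin 45deg)
q = np.array([np.cos(np.pi / 4), 0.0, 0.0, np.sin(np.pi / 4)])

R_pos = quat_to_rot(q)
R_neg = quat_to_rot(-q)   # antipodal point on the 3-sphere

same_rotation = np.allclose(R_pos, R_neg)
```

Every entry of the matrix is quadratic in the quaternion components, which is why flipping the sign of q leaves the rotation unchanged.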

06

Homogeneous Coordinates

Unifying Rotation and Translation
You will simplify your computational pipeline by learning to treat 3D transformations as single matrix multiplications. This algebraic shorthand is what allows you to scale geometric calculations to real-time performance.
From Euclidean Triples to Projective Tuples
Why 3D Vectors Are Not Enough

This section reframes ordinary 3D coordinates as a special case of a broader algebraic structure. It explains the limitations of representing points as (x, y, z) when translations must be applied separately from rotations. By introducing the extra coordinate and interpreting points as equivalence classes under scaling, the reader sees how homogeneous representation prepares geometry for linear treatment.

The Geometry of the Extra Coordinate
Interpreting the w Component

Here the fourth coordinate is given geometric meaning rather than treated as a bookkeeping trick. The distinction between points (w ≠ 0) and directions (w = 0) is developed to unify position and displacement within one algebraic system. This distinction becomes central to rigid body motion, separating translation-sensitive entities from pure directional vectors.

Making Translation Linear
Embedding Affine Motion into Matrix Form

This section demonstrates the core computational breakthrough: by augmenting 3×3 rotation matrices into 4×4 matrices, translation becomes a matrix multiplication rather than a separate vector addition. The algebraic structure of affine transformations is expressed in block-matrix form, revealing how rotation and translation coexist inside a single operator.
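
A small sketch of the block-matrix form, assuming the usual layout with the rotation in the upper-left 3×3 block, the translation in the last column, and a bottom row of (0, 0, 0, 1). The helper make_transform and the sample values are illustrative.

```python
import numpy as np

def make_transform(R, t):
    """Pack a 3x3 rotation R and a translation t into one 4x4 matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# 90-degree rotation about z, followed by a shift of (1, 2, 3)
R = np.array([[0.0, -1.0, 0.0],
              [1.0,  0.0, 0.0],
              [0.0,  0.0, 1.0]])
t = np.array([1.0, 2.0, 3.0])
T = make_transform(R, t)

point = np.array([1.0, 0.0, 0.0, 1.0])      # w = 1: a position
direction = np.array([1.0, 0.0, 0.0, 0.0])  # w = 0: a pure direction

moved_point = T @ point          # rotated and translated
moved_direction = T @ direction  # rotated only; translation has no effect
```

The same matrix product handles both cases: the w component decides whether the translation column contributes.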

07

Transformation Matrices

The SE(3) Lie Group
You will synthesize rotation and translation into a single framework. By understanding the SE(3) group, you gain the ability to chain motions together, representing a sensor's entire path through an environment.
From Isolated Motions to Unified Transformations
Why Rotation and Translation Must Share a Framework

This section motivates the transition from treating rotations and translations as separate operations to embedding them in a single algebraic structure. It revisits rotation matrices and vector translations as linear and affine mappings, highlighting their limitations when applied independently. The need for a unified representation emerges from practical problems in localization and rigid body motion, where chaining successive motions requires a consistent mathematical backbone.

Homogeneous Coordinates and the 4×4 Formulation
Embedding Euclidean Space into Projective Structure

Here the chapter introduces homogeneous coordinates as the mechanism that converts translations into matrix multiplication. By augmenting three-dimensional vectors with a fourth coordinate, both rotation and translation are encoded in a single 4×4 matrix. The structure of this matrix is dissected into its rotational block, translational column, and invariant bottom row, establishing the canonical representation of rigid transformations.

The Structure of SE(3)
Group Properties of Rigid Body Transformations

This section formally introduces the Special Euclidean Group in three dimensions, SE(3), as the set of all rigid body transformations. The group axioms (closure, associativity, identity, and inverse) are interpreted geometrically. The rotational component is identified with SO(3), while translations form a vector space on which the rotations act; together they define a smooth, six-dimensional nonlinear manifold with smooth composition laws.
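
The group properties can be illustrated with 4×4 matrices. In this sketch the helpers make_se3, invert_se3, and rot_z and the frame labels a, b, c are assumptions made for the example. The inverse uses the closed form with rotation R^T and translation -R^T t rather than a general matrix inverse.

```python
import numpy as np

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def make_se3(R, t):
    """4x4 rigid transform from rotation R and translation t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

def invert_se3(T):
    """Closed-form inverse: rotation R^T, translation -R^T t."""
    R, t = T[:3, :3], T[:3, 3]
    return make_se3(R.T, -R.T @ t)

T_ab = make_se3(rot_z(0.3), np.array([1.0, 0.0, 0.0]))
T_bc = make_se3(rot_z(-1.1), np.array([0.0, 2.0, 0.5]))

# Closure: chaining two rigid motions yields another rigid motion
T_ac = T_ab @ T_bc

# Identity and inverse
identity_recovered = np.allclose(T_ab @ invert_se3(T_ab), np.eye(4))

# Composition order matters: SE(3) is not commutative in general
order_matters = not np.allclose(T_ab @ T_bc, T_bc @ T_ab)
```

Chaining many such products is how a sensor's path is accumulated from relative motions.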

08

Point Cloud Fundamentals

Representing the World as Discrete Samples
You will transition from abstract math to raw data by learning how sensors perceive the world. This chapter shows you how to handle the millions of data points that form the basis of geometric maps.
Introduction to Point Clouds
Why the World is Sampled Discretely

Introduce the concept of point clouds as discrete representations of the 3D world, contrasting continuous mathematical models with sensor-acquired data. Highlight the role of point clouds in mapping, robotics, and 3D reconstruction.

Acquisition Techniques
From LiDAR to Photogrammetry

Explain how different sensors capture point clouds, including LiDAR, structured light, and photogrammetry. Discuss the principles behind each method and the types of data they produce, emphasizing point density, noise, and coverage.

Data Structure and Organization
Managing Millions of Points Efficiently

Cover the common data structures for storing and indexing point clouds, including arrays, octrees, and voxel grids. Introduce concepts like spatial hashing and nearest-neighbor search to handle large datasets efficiently.

09

Lidar Principles

Active Ranging for Depth Acquisition
You need to understand the hardware that generates your data. You will study how light detection and ranging sensors provide the high-fidelity distance measurements necessary for sub-centimeter mapping accuracy.
Fundamentals of Light Detection and Ranging
Understanding the Core Physics

Introduces the basic principles of Lidar operation, including how laser pulses interact with surfaces to measure distance. Discusses the physics of light travel, reflection, and time-of-flight measurement essential for precise depth acquisition.

Lidar Sensor Architectures
From Mechanical to Solid-State Systems

Explores the different hardware designs for Lidar sensors, including rotating multi-beam systems, MEMS-based scanners, and solid-state Lidar. Examines trade-offs between resolution, range, and robustness in mapping applications.

Signal Processing and Distance Computation
Extracting Accurate Depth from Returns

Covers the methods used to convert raw laser reflections into distance measurements, including time-of-flight computation, waveform analysis, and noise filtering. Emphasizes techniques that enhance sub-centimeter accuracy in complex environments.

10

Iterative Closest Point

The Core of Geometric Registration
You will learn the 'workhorse' algorithm of geometric SLAM. This chapter teaches you how to align two different point clouds to estimate the motion that occurred between their capture.
Introduction to Point Cloud Alignment
The Need for Geometric Registration in SLAM

Explains the role of point cloud alignment in geometric SLAM, emphasizing why precise registration between successive scans is critical for estimating motion and building accurate maps.

The Iterative Closest Point Algorithm
Step-by-Step Mechanics

Breaks down the ICP algorithm into its iterative steps: correspondence estimation, transformation computation, and convergence checking, with visual examples and intuitive explanations.

Mathematical Formulation
Optimizing Rigid Body Motion

Presents the formal mathematical framework for ICP, including the objective function that minimizes mean squared distance between matched points and the derivation of optimal rotation and translation.
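
For fixed correspondences, the optimal rotation and translation have a well-known closed form based on the singular value decomposition of the cross-covariance matrix. The sketch below uses that SVD approach as an assumption; the chapter may derive the solution differently. The function name best_rigid_transform and the sample points are illustrative.

```python
import numpy as np

def best_rigid_transform(src, dst):
    """Least-squares R, t with R @ src[i] + t ~ dst[i].
    src and dst are (N, 3) arrays of already-matched points."""
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)   # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a rotation, not a reflection
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

# Synthetic check: move a small cloud by a known motion and recover it
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
src = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0],
                [0.0, 0.0, 1.0], [1.0, 1.0, 1.0]])
dst = src @ R_true.T + t_true

R_est, t_est = best_rigid_transform(src, dst)
```

This covers only the transformation step. A full ICP loop would alternate it with re-estimating correspondences, for example by nearest-neighbor search, until the error stops decreasing.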

11

Least Squares Estimation

Optimizing for Minimum Error
You will discover how to handle the inevitable noise in sensor data. By applying least squares, you can find the 'best fit' transformation that explains your observations despite measurement inaccuracies.
Introduction to Least Squares
Understanding the Principle of Error Minimization

Explore the fundamental idea behind least squares: finding a solution that minimizes the sum of squared discrepancies between observed and predicted values. Establish why this approach is critical for noisy sensor data in geometric localization and rigid body motion problems.

Mathematical Formulation
Setting Up the Optimization Problem

Detail the mathematical expression of least squares, including linear and nonlinear forms. Explain how measurement vectors, transformation matrices, and residuals are defined within this context, preparing the reader for practical computations.

Analytical Solutions and Normal Equations
Direct Methods for Linear Systems

Present the derivation of the normal equations for linear least squares. Discuss the conditions under which an analytical solution exists and how it yields the optimal estimate of transformations in rigid body motion.
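
As a minimal illustration of the normal equations, the snippet below fits a line y = a x + b to four points. The data are made up for the example and lie exactly on y = 2x + 1. A unique solution exists when the design matrix A has full column rank.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Design matrix: one row [x_i, 1] per observation, unknowns theta = (a, b)
A = np.column_stack([x, np.ones_like(x)])

# Normal equations: (A^T A) theta = A^T y
theta = np.linalg.solve(A.T @ A, A.T @ y)

# Cross-check against NumPy's least-squares solver
theta_lstsq = np.linalg.lstsq(A, y, rcond=None)[0]
```

Forming A^T A squares the condition number of the problem, so QR- or SVD-based solvers such as np.linalg.lstsq are often preferred when A is poorly conditioned.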

12

The Extended Kalman Filter

Recursive State Estimation
You will master the classic probabilistic approach to SLAM. This chapter shows you how to update your position and map estimates in real-time as new sensor data streams in.
Foundations of Recursive Estimation
From Bayesian Filtering to State Prediction

Introduce the probabilistic framework for state estimation. Explain the distinction between prior, posterior, and predictive distributions, emphasizing how sensor noise and system uncertainty propagate through time.

Linear Kalman Filter Essentials
Building Blocks Before Nonlinearity

Review the classical linear Kalman Filter as a foundation. Cover the predict-update steps, error covariance propagation, and measurement incorporation to prepare for nonlinear extensions.

Extending to Nonlinear Systems
Jacobian Linearization and Local Approximations

Explain how the Extended Kalman Filter handles nonlinear motion and measurement models. Introduce the Jacobian matrix as a local linear approximation, detailing its role in propagating uncertainty.
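
The role of the Jacobian can be sketched with a scalar toy problem: a robot at position x on a line measures its range to a beacon mounted at a known height above the origin. The scenario, the function names, and all numbers are assumptions chosen for illustration.

```python
import numpy as np

def h(x, height=3.0):
    """Nonlinear measurement model: range from position x to the beacon."""
    return np.hypot(x, height)

def h_jacobian(x, height=3.0):
    """Derivative dh/dx, the local linear approximation of h at x."""
    return x / np.hypot(x, height)

def ekf_update(x, P, z, R, height=3.0):
    """One EKF measurement update for a scalar state.
    x, P: prior mean and variance; z: measured range; R: measurement variance."""
    H = h_jacobian(x, height)
    S = H * P * H + R                       # innovation variance
    K = P * H / S                           # Kalman gain
    x_new = x + K * (z - h(x, height))      # correct with the nonlinear residual
    P_new = (1.0 - K * H) * P               # variance shrinks after the update
    return x_new, P_new

x_new, P_new = ekf_update(x=4.0, P=1.0, z=5.5, R=0.36)
```

With the prior at x = 4 the predicted range is 5 and the Jacobian is 0.8, so the gain is 0.8 and the update gives a mean of about 4.4 with variance about 0.36.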

13

Normal Distributions Transform

Probabilistic Map Matching
You will explore an alternative to ICP that uses probability densities rather than raw points. This allows you to build smoother, more robust maps that are less sensitive to outliers and local minima.
Introduction to Probabilistic Map Alignment
Why point-based registration can fail

Introduce the limitations of traditional Iterative Closest Point (ICP) methods, emphasizing sensitivity to noise, outliers, and local minima. Motivate the need for probabilistic approaches that leverage density representations of spatial data.

Foundations of the Normal Distributions Transform
From points to continuous densities

Explain the core idea of representing discrete point clouds as a set of Gaussian distributions over a voxelized space. Discuss how this representation smooths spatial data and prepares it for probabilistic alignment.

Constructing the NDT Map
Voxelization, covariance, and density estimation

Detail the step-by-step procedure for building an NDT map from raw point cloud data: dividing space into cells, computing mean and covariance, and assembling the Gaussian mixture that represents the map.
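
The procedure can be sketched as follows. The function name build_ndt_map, the min_points threshold, and the sample data are assumptions for illustration. Practical implementations typically also regularize near-singular covariances, which is omitted here.

```python
import numpy as np
from collections import defaultdict

def build_ndt_map(points, cell_size, min_points=3):
    """Return {cell index: (mean, covariance)} for every cell that
    contains at least min_points points."""
    cells = defaultdict(list)
    for p in points:
        # Integer index of the cubic cell containing p
        key = tuple(int(i) for i in np.floor(p / cell_size))
        cells[key].append(p)

    ndt = {}
    for key, pts in cells.items():
        if len(pts) >= min_points:
            pts = np.asarray(pts)
            # Sample mean and 3x3 sample covariance of the points in this cell
            ndt[key] = (pts.mean(axis=0), np.cov(pts, rowvar=False))
    return ndt

rng = np.random.default_rng(0)
cluster_a = rng.uniform(0.1, 0.9, size=(10, 3))        # falls in cell (0, 0, 0)
cluster_b = rng.uniform(0.1, 0.9, size=(10, 3)) + 2.0  # falls in cell (2, 2, 2)
ndt_map = build_ndt_map(np.vstack([cluster_a, cluster_b]), cell_size=1.0)
```

Each entry of ndt_map is one Gaussian component; together the components form the piecewise density that scans are later matched against.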

14

Graph-Based SLAM

Poses, Constraints, and Optimization
You will learn the modern standard for SLAM. By representing the robot's path as a graph, you can perform global optimizations that correct the 'drift' that accumulates over long distances.
From Trajectories to Graphs
Modeling Robot Paths as Graph Structures

Introduce the conceptual shift from sequential pose estimation to representing the robot's trajectory as nodes in a graph, highlighting how this structure captures both the spatial layout and connectivity of observed landmarks.

Defining Constraints
Edges, Measurements, and Relative Poses

Explain how constraints between poses arise from odometry and sensor measurements, and how these relationships form edges in the graph, encoding relative transformations and uncertainties.

Formulating the Optimization Problem
From Error Functions to Global Cost Minimization

Detail the mathematical formulation of the graph optimization problem, introducing error functions, least-squares objectives, and the principles behind minimizing accumulated drift across the trajectory.
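
A one-dimensional toy pose graph shows how the least-squares objective spreads drift across the trajectory. The setup is an assumption for illustration: four poses on a line, the first fixed at 0, three odometry edges that each over-report the step as 1.1, one loop-closure edge stating that the last pose is 3.0 from the first, and equal weights on all edges.

```python
import numpy as np

# Unknowns: x1, x2, x3 (x0 is fixed at 0 to anchor the graph)
# Each row encodes one edge as a linear constraint A @ x = b
A = np.array([
    [ 1.0,  0.0, 0.0],   # odometry:     x1 - x0 = 1.1
    [-1.0,  1.0, 0.0],   # odometry:     x2 - x1 = 1.1
    [ 0.0, -1.0, 1.0],   # odometry:     x3 - x2 = 1.1
    [ 0.0,  0.0, 1.0],   # loop closure: x3 - x0 = 3.0
])
b = np.array([1.1, 1.1, 1.1, 3.0])

# Minimize the sum of squared edge errors
x_opt = np.linalg.lstsq(A, b, rcond=None)[0]

# Dead reckoning alone would have placed the poses here
x_odometry = np.cumsum([1.1, 1.1, 1.1])
```

Odometry alone ends at 3.3. The optimized poses come out at about 1.025, 2.05, and 3.075, so the 0.3 disagreement is shared equally among the four edges. Real pose graphs use SE(2) or SE(3) poses and information-matrix weights, which makes the problem nonlinear.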

15

Loop Closure Detection

Recognizing Visited Locations
You will tackle the challenge of drift by learning how to recognize when you have returned to a previous location. This 'aha!' moment for the algorithm is what allows for globally consistent maps in which a revisited place is not duplicated.
Understanding Drift in Pose Estimation
Why small errors accumulate in mapping

Introduce the concept of drift in sequential localization, explaining how small errors in position and orientation compound over time, leading to map distortions. Discuss the mathematical representation of drift in rigid body motion and geometric maps.

The Concept of Loop Closure
Recognizing previously visited locations

Define loop closure in geometric localization. Explain how detecting a previously visited location adds a constraint that corrects accumulated drift, creating globally consistent maps. Introduce examples in robotics and computer vision.

Feature-Based Recognition Methods
Using landmarks and descriptors

Discuss methods for identifying locations using distinctive features, including visual, geometric, and semantic landmarks. Explore descriptors, feature matching, and the mathematical tools for quantifying similarity between candidate locations.

16

Covariance Matrices

Quantifying Uncertainty
You must understand not just where you are, but how sure you are of that fact. This chapter teaches you to track the error margins of your geometric estimates, which is vital for safety-critical applications.
Foundations of Covariance
Understanding Statistical Dependencies

Introduce the mathematical definition of covariance, explore how variables interact, and establish the foundational role of covariance in tracking uncertainty in geometric measurements.

Covariance Matrices in Multi-Dimensional Spaces
From Scalars to Matrices

Explain how covariance extends from single-variable cases to vectors and matrices, emphasizing geometric interpretations such as ellipsoids representing uncertainty in position and orientation.

Eigenvalue Decomposition and Uncertainty Geometry
Axes of Confidence

Discuss how eigenvalues and eigenvectors of covariance matrices reveal principal directions of uncertainty, enabling visualization of error ellipsoids and understanding anisotropic measurement errors.
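
As a small sketch, the eigendecomposition of a 2×2 position covariance gives the axes of its one-sigma error ellipse. The matrix values are made up for the example.

```python
import numpy as np

# Position covariance with correlated x and y errors
Sigma = np.array([[2.5, 1.5],
                  [1.5, 2.5]])

# eigh is intended for symmetric matrices: it returns eigenvalues in
# ascending order and orthonormal eigenvectors as columns
eigvals, eigvecs = np.linalg.eigh(Sigma)

# Semi-axis lengths of the 1-sigma ellipse are the square roots of the eigenvalues
semi_axes = np.sqrt(eigvals)

# Direction of largest uncertainty (unit vector, defined up to sign)
major_axis = eigvecs[:, -1]
```

Here the eigenvalues are 1 and 4, so the ellipse has semi-axes 1 and 2, with the long axis along the diagonal direction (1, 1).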

17

Odometry Systems

Relative Motion via Dead Reckoning
You will examine how wheel encoders and visual odometry provide a motion baseline for your SLAM system. This chapter shows you how to use internal motion data to assist your external geometric mapping.
From Global Maps to Local Increments
Why Dead Reckoning Remains Foundational

This section reframes odometry not as a primitive fallback, but as the metric backbone of geometric localization. It contrasts absolute positioning with incremental motion estimation, establishing dead reckoning as the mechanism that propagates pose in the absence of external references. The mathematical notion of relative transformation in SE(2) and SE(3) is introduced as the natural language of odometric integration.

Kinematic Models of Wheeled Motion
Encoder Counts to Continuous Pose

This section develops the forward kinematics of common mobile robot bases, including differential drive and car-like steering. Encoder ticks are translated into arc lengths and incremental rotations, which are then lifted into rigid body transformations. Emphasis is placed on the geometric interpretation of curvature, instantaneous centers of rotation, and discrete-time integration.
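
For a differential-drive base, the chain from encoder ticks to a pose increment might look like the sketch below. The function name, the parameter names, and the midpoint-heading integration rule are assumptions; other discretizations, such as exact arc integration, are equally common.

```python
import numpy as np

def diff_drive_step(pose, ticks_left, ticks_right,
                    ticks_per_rev, wheel_radius, wheel_base):
    """Advance a planar pose (x, y, theta) by one pair of encoder readings."""
    x, y, theta = pose
    meters_per_tick = 2.0 * np.pi * wheel_radius / ticks_per_rev
    d_left = ticks_left * meters_per_tick      # arc length of the left wheel
    d_right = ticks_right * meters_per_tick    # arc length of the right wheel

    d_center = 0.5 * (d_left + d_right)        # distance travelled by the base
    d_theta = (d_right - d_left) / wheel_base  # change in heading

    # Midpoint rule: move along the average heading over the step
    x += d_center * np.cos(theta + 0.5 * d_theta)
    y += d_center * np.sin(theta + 0.5 * d_theta)
    theta += d_theta
    return np.array([x, y, theta])

params = dict(ticks_per_rev=100, wheel_radius=0.05, wheel_base=0.3)
straight = diff_drive_step(np.zeros(3), 100, 100, **params)  # equal ticks
spin = diff_drive_step(np.zeros(3), -50, 50, **params)       # opposite ticks
```

Equal tick counts give straight-line motion, and opposite counts rotate the base in place about its center, which is the instantaneous center of rotation in that case.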

Visual Odometry as Geometric Self-Motion
From Pixel Displacement to Rigid Transformation

Here the chapter expands odometry beyond wheels to camera-based motion estimation. The section explains how feature correspondences, epipolar geometry, and perspective projection enable incremental pose recovery. Monocular scale ambiguity, stereo baselines, and depth inference are analyzed as constraints that shape the metric reliability of visual dead reckoning.

18

The Bundle Adjustment Technique

Refining 3D Structures and Camera Poses
You will dive into the heavy-duty optimization used in high-end AR. Bundle adjustment allows you to refine your entire geometric map simultaneously, ensuring that every point and pose is as accurate as possible.
From Local Estimates to Global Consistency
Why Pairwise Geometry Is Not Enough

This section motivates bundle adjustment as the natural culmination of geometric localization. After incremental pose estimation and triangulation, accumulated drift and inconsistent feature geometry inevitably appear. We reframe the reconstruction problem as a single coupled system in which all camera poses and 3D points must satisfy the same metric constraints simultaneously, establishing the conceptual leap from local optimization to global refinement.

The Reprojection Error as a Metric Objective
Formulating the Nonlinear Least Squares Problem

Here we derive the bundle adjustment objective from first principles of perspective projection. The section formalizes how 3D points and rigid body poses generate predicted image measurements, and how discrepancies with observed pixels define a nonlinear least squares cost. Emphasis is placed on parameterizing rotations, translations, and structure in a way consistent with the book’s metric treatment of rigid motion.
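
A single term of this cost can be sketched with an ideal pinhole camera. The intrinsic matrix K, the pose, the 3D point, and the observed pixel are made-up values, and lens distortion is ignored.

```python
import numpy as np

def project(K, R, t, X):
    """Project world point X into pixel coordinates for a camera whose
    pose maps world to camera coordinates as X_cam = R @ X + t."""
    X_cam = R @ X + t
    uvw = K @ X_cam
    return uvw[:2] / uvw[2]   # perspective division by depth

def reprojection_residual(K, R, t, X, observed):
    """Predicted pixel minus observed pixel."""
    return project(K, R, t, X) - observed

K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.zeros(3)
X = np.array([0.2, -0.1, 2.0])
observed = np.array([371.0, 214.0])

predicted = project(K, R, t, X)
residual = reprojection_residual(K, R, t, X, observed)
```

The full objective sums the squared norm of this residual over every observation of every point in every camera. The division by depth is what makes the problem nonlinear.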

Linearization and Iterative Refinement
Gauss–Newton, Levenberg–Marquardt, and Trust in Curvature

Because the reprojection objective is nonlinear, it must be solved iteratively. This section explains how first-order linearization yields normal equations, and how Gauss–Newton and Levenberg–Marquardt balance convergence speed with numerical stability. Special attention is given to conditioning, damping strategies, and the geometric interpretation of curvature in pose–structure space.

19

Voxelization and Occupancy Grids

Efficient Spatial Partitioning
You will learn how to organize point cloud data into grid-based structures. This makes your maps searchable and usable for collision detection and path planning in real-time.
From Continuous Space to Discrete Cells
Quantizing Euclidean Geometry

This section introduces voxelization as the discretization of continuous three-dimensional space into uniform volumetric elements. It connects the mathematical structure of Euclidean space with integer lattice indexing, explaining how metric coordinates are mapped into grid coordinates. Emphasis is placed on resolution selection, quantization error, and the trade-off between geometric fidelity and computational tractability.
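
The mapping from metric to grid coordinates can be sketched in a few lines. The function names, the grid origin, and the resolution are illustrative assumptions.

```python
import numpy as np

def point_to_voxel(point, origin, resolution):
    """Integer lattice index of the voxel containing a metric point."""
    return np.floor((point - origin) / resolution).astype(int)

def voxel_center(index, origin, resolution):
    """Metric coordinates of the center of a voxel."""
    return origin + (index + 0.5) * resolution

origin = np.zeros(3)
resolution = 0.25   # voxel edge length in meters

point = np.array([0.6, -0.1, 1.0])
index = point_to_voxel(point, origin, resolution)
center = voxel_center(index, origin, resolution)

# Quantization error: distance from the point to its voxel center,
# bounded by half the voxel diagonal
error = np.linalg.norm(point - center)
max_error = 0.5 * np.sqrt(3.0) * resolution
```

Using floor rather than truncation keeps indexing consistent for negative coordinates: the y value of -0.1 lands in cell -1, not cell 0.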

The Voxel as a Metric Primitive
Topology, Adjacency, and Neighborhood Structure

Here the voxel is treated not merely as a cube but as a topological unit within a discrete manifold. The section analyzes connectivity models (6, 18, and 26 neighbors), adjacency graphs, and how these discrete relationships approximate continuous proximity. These structures form the basis for collision checking and graph-based path planning.

Point Clouds to Occupancy Fields
Statistical Encoding of Spatial Evidence

This section explains how raw point cloud measurements are integrated into occupancy grids. It covers deterministic and probabilistic occupancy models, sensor fusion, and the accumulation of evidence over time. The mathematical framing emphasizes Bayesian updates and how uncertainty propagates across grid cells.
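
One common way to implement the Bayesian update is in log-odds form, where each measurement adds a term to a running sum per cell. The sketch below assumes independent measurements; the inverse sensor model values 0.7 (hit) and 0.3 (miss) and the prior of 0.5 are illustrative.

```python
import numpy as np

def logit(p):
    """Log-odds of a probability."""
    return np.log(p / (1.0 - p))

def update_log_odds(l_prev, p_meas, prior=0.5):
    """Bayesian occupancy update for one cell in log-odds form.
    p_meas is the occupancy probability implied by a single measurement."""
    return l_prev + logit(p_meas) - logit(prior)

def probability(l):
    """Convert log-odds back to a probability."""
    return 1.0 / (1.0 + np.exp(-l))

# Three consecutive hits on the same cell
l = 0.0
for _ in range(3):
    l = update_log_odds(l, 0.7)
p_after_hits = probability(l)

# One hit followed by one miss cancels out
l_mixed = update_log_odds(update_log_odds(0.0, 0.7), 0.3)
p_mixed = probability(l_mixed)
```

Three hits raise the occupancy probability from 0.5 to about 0.93, while a hit and a miss return the cell to 0.5. Working in log-odds turns the multiplicative Bayes rule into addition and avoids numerical trouble near 0 and 1.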

20

Non-linear Least Squares

Optimizing Complex Cost Functions
You will move beyond linear assumptions to solve the actual, messy math of 3D geometry. This chapter provides the iterative numerical methods required to converge on the truth in a non-linear world.
When Geometry Refuses to Be Linear
Why Real-World Localization Breaks Closed-Form Solutions

This section explains why rigid body motion, perspective projection, and multi-sensor fusion naturally produce non-linear residuals. It contrasts linear least squares with non-linear formulations arising in pose estimation and structure reconstruction, emphasizing how rotations, depth, and reprojection errors introduce curvature into the cost landscape.

The Geometry of Error Surfaces
Cost Landscapes, Curvature, and Local Minima

Develops intuition for objective functions as high-dimensional surfaces shaped by measurement noise and parameter coupling. Readers explore gradients, Hessians, conditioning, and the role of curvature in determining convergence behavior. Special attention is given to the distinction between convex and non-convex regions in geometric estimation problems.

From Taylor Expansion to Iterative Refinement
Linearizing the Non-linear at Each Step

Introduces the core idea of iterative linearization using first-order Taylor expansion. The non-linear residual function is locally approximated, transforming each iteration into a linear least squares problem. The derivation of the normal equations for the incremental update is framed explicitly in the context of pose and motion refinement.
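
The iteration can be sketched on a small planar problem: estimating the angle and translation that align two sets of matched 2D points. The problem setup, the function names, and the plain Gauss-Newton update without damping or a convergence test are assumptions made to keep the example short.

```python
import numpy as np

def residuals_and_jacobian(params, src, dst):
    """Residuals r_i = R(theta) p_i + t - q_i stacked as (x0, y0, x1, y1, ...)
    and their Jacobian with respect to (theta, tx, ty)."""
    theta, tx, ty = params
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    dR = np.array([[-s, -c], [c, -s]])      # derivative of R w.r.t. theta

    r = (src @ R.T + np.array([tx, ty]) - dst).ravel()
    J = np.zeros((2 * len(src), 3))
    J[:, 0] = (src @ dR.T).ravel()          # d r / d theta
    J[0::2, 1] = 1.0                        # d r_x / d tx
    J[1::2, 2] = 1.0                        # d r_y / d ty
    return r, J

def gauss_newton(src, dst, params0, iterations=10):
    params = np.asarray(params0, dtype=float)
    for _ in range(iterations):
        r, J = residuals_and_jacobian(params, src, dst)
        # Normal equations of the linearized problem: (J^T J) delta = -J^T r
        delta = np.linalg.solve(J.T @ J, -J.T @ r)
        params = params + delta
    return params

true_params = np.array([0.4, 1.0, -2.0])    # theta, tx, ty
c, s = np.cos(true_params[0]), np.sin(true_params[0])
src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 1.0], [-1.0, 2.0]])
dst = src @ np.array([[c, -s], [s, c]]).T + true_params[1:]

estimate = gauss_newton(src, dst, params0=[0.0, 0.0, 0.0])
```

Each pass solves a linear least squares problem for the increment and then re-linearizes at the updated estimate. With noise-free data and a starting guess this close, the estimate reaches the true parameters in a handful of iterations.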

21

The Future of Geometric Maps

Metric Foundations for the Metaverse
You will conclude by looking at the broader impact of your work. Having mastered the geometric foundations, you are now prepared to build the spatial maps that will define the next generation of human-computer interaction.
From Coordinate Frames to Persistent Worlds
Extending Rigid Body Geometry into Shared Spatial Substrates

This section reframes spatial mapping as the natural extension of rigid body kinematics and metric reconstruction into persistent, shared environments. It connects coordinate frames, transformations, and pose estimation to the construction of durable geometric worlds. Emphasis is placed on how local metric consistency scales into globally coherent spatial substrates suitable for long-term interaction.

Metric Integrity at Planetary Scale
From Local SLAM to Distributed Geometric Consensus

Here the chapter explores how simultaneous localization and mapping evolves into large-scale, networked spatial systems. It examines loop closure, global optimization, and distributed map fusion as problems of metric consensus across devices and users. The focus is on maintaining geometric invariants under scale, latency, and asynchronous observation.

Semantic Structure on a Metric Backbone
Embedding Meaning Without Sacrificing Geometry

This section investigates how semantic labels, object models, and scene understanding can be layered atop metric maps without eroding geometric rigor. It distinguishes between metric truth and semantic interpretation, showing how object recognition, surface classification, and scene graphs depend fundamentally on stable geometric reconstruction.

Available eBook Editions

Arabic
English
French
German
Italian
Japanese
Korean
Portuguese
Spanish
Turkish