Stereo Vision
Stereo vision is a method of determining the 3D location of objects in a scene by comparing
images from two separate cameras. Now suppose you have some robot on Mars and he sees an alien
(at point P(X,Y,Z)) with two video cameras. Where does the robot need to drive to run over this alien
(for 20 kill points)?
First, let's analyze the robot camera itself. Although it is a simplification that introduces minor error,
the pinhole camera model will be used in the following examples:
The image plane is where the photo-receptors are located in the camera, and the lens
acts as the pinhole through which light passes. The focal distance is the distance between
the lens and the photo-receptors (it can be found in the camera datasheet).
Point P is the location of the alien, and point p is where the alien
appears on the photo-receptors. The optical axis is the direction the camera is pointing.
Redrawing the diagram to make it mathematically simpler to understand,
we get this new diagram
with the following equations for a single camera:
x_camL = focal_length * X_actual / Z_actual
y_camL = focal_length * Y_actual / Z_actual
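The two projection equations above can be sketched in a few lines of Python. This is only an illustration: the function name project_pinhole and the sample numbers are made up for this example, and the point is assumed to be given in the camera's own coordinate frame with Z_actual > 0.

```python
def project_pinhole(X_actual, Y_actual, Z_actual, focal_length):
    """Project a 3D point (camera coordinates) onto the image plane
    using the pinhole model: x = f * X / Z, y = f * Y / Z."""
    if Z_actual <= 0:
        raise ValueError("point must be in front of the camera (Z_actual > 0)")
    x_cam = focal_length * X_actual / Z_actual
    y_cam = focal_length * Y_actual / Z_actual
    return x_cam, y_cam


# Hypothetical numbers: focal length 0.05 m, alien at (1, 0.5, 10) m.
x_camL, y_camL = project_pinhole(1.0, 0.5, 10.0, 0.05)
print(x_camL, y_camL)
```

Note that the image coordinates come out in the same units as the focal length (meters here), not in pixels.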
CASE 1: Parallel Cameras
Now moving on to two parallel-facing cameras (L for the left camera and R for the right camera), we have this diagram:
The Z-axis is the optical axis (the direction the cameras are pointing). b is
the distance between cameras, while f is still the focal length.
The equations of stereo triangulation (because it looks like a triangle) are:
Z_actual = (b * focal_length) / (x_camL - x_camR)
X_actual = x_camL * Z_actual / focal_length
Y_actual = y_camL * Z_actual / focal_length
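As a rough sketch, the three triangulation equations above translate directly into code. The function name triangulate_parallel and the sample numbers are invented for illustration, and the result is assumed to be expressed in the left camera's coordinate frame, with the right camera offset by b along the X-axis.

```python
def triangulate_parallel(x_camL, y_camL, x_camR, b, focal_length):
    """Recover (X, Y, Z) from a matched point in two parallel cameras."""
    disparity = x_camL - x_camR
    if disparity == 0:
        # Zero disparity means the point is infinitely far away.
        raise ValueError("zero disparity: depth cannot be computed")
    Z_actual = (b * focal_length) / disparity
    X_actual = x_camL * Z_actual / focal_length
    Y_actual = y_camL * Z_actual / focal_length
    return X_actual, Y_actual, Z_actual


# Hypothetical measurements: cameras 0.2 m apart, focal length 0.05 m.
# These image coordinates correspond to an alien near (1, 0.5, 10) m.
X, Y, Z = triangulate_parallel(0.005, 0.0025, 0.004, b=0.2, focal_length=0.05)
print(X, Y, Z)
```

The difference x_camL - x_camR is usually called the disparity. Because it sits in the denominator, small measurement errors in the disparity produce large depth errors for far-away objects.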
CASE 2a: Non-Parallel Cameras, Rotation About Y-axis
And lastly, what if the cameras are pointing in different, non-parallel directions? In the diagram below,
the Z-axis is the optical axis of the left camera, while the Zo-axis is the optical axis of the right camera.
Both cameras lie on the XZ plane, but the right camera is rotated by some angle phi. The point (0,0,Zo)
where both optical axes (plural of axis, pronounced ACK-seez) intersect is called the fixation point.
Note that the fixation point could also be behind the cameras, when Zo < 0.
Calculating the alien location:
Zo = b / tan(phi)
Z_actual = (b * focal_length) / (x_camL - x_camR + focal_length * b / Zo)
X_actual = x_camL * Z_actual / focal_length
Y_actual = y_camL * Z_actual / focal_length
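The Case 2a equations can be sketched the same way. This is a minimal illustration that follows the formulas above literally. The function name triangulate_rotated_y is hypothetical, phi is assumed to be in radians, and the sign convention for phi is taken to be whatever makes Zo = b / tan(phi) hold. When phi is zero the cameras are parallel, Zo is at infinity, and the extra term is dropped.

```python
import math


def triangulate_rotated_y(x_camL, y_camL, x_camR, b, focal_length, phi):
    """Triangulate with the right camera rotated by phi about the Y-axis."""
    tan_phi = math.tan(phi)
    if tan_phi == 0.0:
        # Parallel cameras: fixation point at infinity, no correction term.
        correction = 0.0
    else:
        Zo = b / tan_phi                      # depth of the fixation point
        correction = focal_length * b / Zo
    denominator = x_camL - x_camR + correction
    if denominator == 0:
        raise ValueError("degenerate measurement: depth cannot be computed")
    Z_actual = (b * focal_length) / denominator
    X_actual = x_camL * Z_actual / focal_length
    Y_actual = y_camL * Z_actual / focal_length
    return X_actual, Y_actual, Z_actual


# Hypothetical numbers: tan(phi) = 0.1, so the fixation point is at Zo = 2 m.
phi = math.atan(0.1)
print(triangulate_rotated_y(0.005, 0.0, 0.0, b=0.2, focal_length=0.05, phi=phi))
```

With phi set to 0 the function gives the same answer as the parallel-camera equations of Case 1, which is a handy sanity check.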
CASE 2b: Non-Parallel Cameras, Rotation About X-axis
Calculating the alien location:
Z_actual = (b * focal_length) / (x_camL - x_camR)
X_actual = x_camL * Z_actual / focal_length
Y_actual = y_camL * Z_actual / focal_length + tan(phi) * Z_actual
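Here is a comparable sketch for Case 2b. It takes the disparity to be x_camL - x_camR as in Case 1 and reads the Z in the last equation as Z_actual. Both are interpretations, as are the function name triangulate_rotated_x and the sample values. phi is assumed to be in radians.

```python
import math


def triangulate_rotated_x(x_camL, y_camL, x_camR, b, focal_length, phi):
    """Triangulate with a camera rotation of phi about the X-axis."""
    disparity = x_camL - x_camR
    if disparity == 0:
        raise ValueError("zero disparity: depth cannot be computed")
    Z_actual = (b * focal_length) / disparity
    X_actual = x_camL * Z_actual / focal_length
    # The tilt shifts the vertical coordinate by tan(phi) per unit of depth.
    Y_actual = y_camL * Z_actual / focal_length + math.tan(phi) * Z_actual
    return X_actual, Y_actual, Z_actual


# Hypothetical numbers: same measurements as a parallel setup, tan(phi) = 0.1.
phi = math.atan(0.1)
print(triangulate_rotated_x(0.005, 0.0025, 0.004, b=0.2, focal_length=0.05, phi=phi))
```

Only Y_actual changes compared with the parallel case. The depth and the horizontal position are computed exactly as before.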
CASE 2c: Non-Parallel Cameras, Rotation About Z-axis
For simplicity, rotation around the optical axis is usually dealt with by rotating the image before applying matching and triangulation.
Given the translation vector T and rotation matrix R describing the transformation from
left camera to right camera coordinates, the equation to solve for stereo triangulation is:
p' = R^T * (p - T)
where p and p' are the coordinates of P in left and right camera coordinates respectively, and R^T is the transpose of R (which, for a rotation matrix, is also its inverse).
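As a sketch of how such a change of coordinates is applied, the snippet below assumes the common rigid-transform form p' = R^T * (p - T). Treat that form, the function name left_to_right, and the sample matrices as assumptions made for illustration. Plain Python lists are used so no extra libraries are needed.

```python
def left_to_right(p, R, T):
    """Convert point p from left to right camera coordinates,
    assuming p' = R^T * (p - T) with R a 3x3 rotation matrix."""
    # Shift by the translation first.
    d = [p[i] - T[i] for i in range(3)]
    # Multiply by the transpose of R: row i of R^T is column i of R.
    return [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]


# Hypothetical setup: no rotation, right camera 0.2 m along the X-axis.
identity = [[1.0, 0.0, 0.0],
            [0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0]]
print(left_to_right([1.0, 0.5, 10.0], identity, [0.2, 0.0, 0.0]))
```

With the identity rotation this reduces to subtracting the baseline from the X coordinate, which matches the parallel-camera setup of Case 1.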
Please continue on in the Computer Vision Tutorial Series
for Part 4: Computer Vision Algorithms for Motion.