Computer graphics deals with images, either stationary or moving. Images are moved around, their scale is changed,
and three dimensions are projected onto two dimensions. All the main operations are done by
matrices, but the shape of these matrices is surprising.
In Computer Graphics, matrices are used to represent many different types of data.
Games that involve 2D or 3D graphics rely on matrix operations to display the game environment and the characters in it.
The transformations of three-dimensional space are done with 4-by-4 matrices. You would expect 3-by-3. The reason for the change is that one of the six key operations (translation)
cannot be done with a 3-by-3 matrix multiplication. Here are the six operations:
Translation (shift the origin to another point P₀(x₀, y₀, z₀)).
Rescaling (by c in all directions or by different factors c₁, c₂, c₃).
Shearing (in different directions).
Rotation (around an axis through the origin or an axis through P₀).
Reflection (with respect to some hyperplane).
Projection (onto a plane through the origin or a plane through P₀).
Translation
Translation is the easiest—just add (x₀, y₀, z₀) to every point. But this is not linear! No 3×3 matrix can move the origin. So we change the coordinates of the origin to (0, 0, 0, 1).
This is why the matrices are 4-by-4. The “homogeneous coordinates” of the point (x, y, z) are (x, y, z, 1) and we now show how they work.
Translation shifts the whole three-dimensional space along the vector v₀. The origin moves to (x₀, y₀, z₀). This vector v₀ is added to every point v in ℝ³. Using homogeneous coordinates, the 4-by-4 matrix T shifts the whole space by v₀:
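Written out in the two conventions used on this page, T would read as follows (a reconstruction by analogy with the 2D translation matrix given later in this tutorial; brackets for the column-vector form, parentheses for the row-vector form):
\[
{\bf T} = \begin{bmatrix} 1&0&0&x_0 \\ 0&1&0&y_0 \\ 0&0&1&z_0 \\ 0&0&0&1 \end{bmatrix} \qquad \mbox{or} \qquad \begin{pmatrix} 1&0&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \\ x_0 & y_0 & z_0 & 1 \end{pmatrix} .
\]
The bracketed matrix sends the column (x, y, z, 1)ᵀ to (x + x₀, y + y₀, z + z₀, 1)ᵀ; the parenthesized one does the same for the row [x, y, z, 1].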
Since computer graphics works with row vectors, but most mathematical subjects use column vectors, we try to please both parties (an almost impossible task). Therefore, we use different notations for matrices as operators. If a matrix acts from the left on column vectors, we write it in brackets; if it multiplies row vectors from the right, we enclose it in parentheses. Correspondingly, the first matrix in the formula above is considered as an operator acting on column vectors, whereas the latter acts on row vectors from the right: [0, 0, 0, 1]T = [x₀, y₀, z₀, 1].
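A quick symbolic check of both conventions in Mathematica may help here. This is only a sketch; the names Trow, Tcol, and v are chosen for illustration and do not come from the text.

```mathematica
(* row-vector form: the translation entries sit in the last row *)
Trow = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {x0, y0, z0, 1}};
(* column-vector form is the transpose *)
Tcol = Transpose[Trow];
v = {x, y, z, 1};
v . Trow              (* row vector multiplied by the matrix from the right *)
Tcol . v              (* matrix acting on the column vector from the left *)
{0, 0, 0, 1} . Trow   (* image of the origin *)
```

Both products should return {x + x0, y + y0, z + z0, 1}, and the origin is sent to {x0, y0, z0, 1}.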
Example 1:
■
End of Example 1
Rescaling
Scaling is used to make a picture fit a page: we change its width and height. Another example is a copier that rescales a figure by 85%. In linear algebra, we achieve this with the identity matrix multiplied by the scalar 0.85.
That matrix is normally 2-by-2 for a plane and 3-by-3 for a solid. In computer graphics,
with homogeneous coordinates, the matrix becomes one size larger:
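For a uniform factor c, the enlarged matrices would presumably take the following form (a reconstruction consistent with the remark below that the 1 in the lower corner is kept):
\[
{\bf S} = \begin{pmatrix} c&0&0 \\ 0&c&0 \\ 0&0&1 \end{pmatrix} \quad \mbox{(plane)} , \qquad {\bf S} = \begin{pmatrix} c&0&0&0 \\ 0&c&0&0 \\ 0&0&c&0 \\ 0&0&0&1 \end{pmatrix} \quad \mbox{(solid)} .
\]
Being diagonal, these matrices look the same in the row-vector and column-vector conventions.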
S is not cI (constant times the identity matrix). We keep the “1” in the lower corner. Then [x, y, z, 1] times S is the
correct answer in homogeneous coordinates. The origin stays in its normal position because
[0 0 0 1]S = [0 0 0 1].
If we change that 1 to c, the result is strange. The point (cx, cy, cz, c) is the same
as (x, y, z, 1). The special property of homogeneous coordinates is that multiplying by cI does not move the point. The origin in ℝ³ has homogeneous coordinates (0, 0, 0, 1) and
(0, 0, 0, c) for every nonzero c. This is the idea behind the word “homogeneous.”
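The identification can be made explicit by dividing by the last coordinate. The following line is a small worked illustration, with the numbers in the final example chosen arbitrarily:
\[
[x, y, z, 1]\, (c\,{\bf I}) = [cx, cy, cz, c] \sim \left[ \frac{cx}{c}, \frac{cy}{c}, \frac{cz}{c}, 1 \right] = [x, y, z, 1] , \qquad \mbox{e.g.} \quad [2, 4, 6, 2] \sim [1, 2, 3, 1] .
\]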
Scaling can be different in different directions. To fit a full-page picture onto a half-page, scale the y direction by ½. To create a margin, scale the x direction by 3/4. The
graphics matrix is diagonal but not 2-by-2. It is 3-by-3 to rescale a plane and 4-by-4 to rescale space:
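With the factors named in the text, these diagonal matrices can be written out as follows (a reconstruction, assuming the half-page example for the plane and factors c₁, c₂, c₃ for space):
\[
{\bf S} = \begin{pmatrix} 3/4 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0&0&1 \end{pmatrix} \qquad \mbox{and} \qquad {\bf S} = \begin{pmatrix} c_1 &0&0&0 \\ 0& c_2 &0&0 \\ 0&0& c_3 &0 \\ 0&0&0&1 \end{pmatrix} .
\]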
That last matrix S rescales the x, y, z directions by positive numbers c₁, c₂, c₃. The extra
column in all these matrices leaves the extra 1 at the end of every point.
In certain instances, an object has to be resized within the same coordinate system,
so that the axes are rescaled by different factors. As mentioned before, to scale a 2D point in the x
direction by Sx and in the y direction by Sy, we transform it as:
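In coordinates, and assuming the column-vector convention with homogeneous coordinates, this reads:
\[
x' = S_x\, x , \quad y' = S_y\, y , \qquad \mbox{or} \qquad \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} S_x & 0 & 0 \\ 0 & S_y & 0 \\ 0&0&1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} .
\]
For example, Sx = 3/4 and Sy = 1/2 send the point (4, 2) to (3, 1).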
3D models have several uses in computer-aided design (CAD) for engineering
purposes. Using advanced scanning techniques such as MRI, 3D models of organs can
be created to help diagnose and treat patients. In other medical uses, 3D
models of proteins can help cancer research.
3D models are also often used in movies and TV shows to represent characters as well as objects. These objects can be made realistic with advanced modeling techniques.
The 4×4 scaling matrix S is the same size as the affine translation matrix T, so they can
be multiplied. To translate and then rescale, multiply v T S. To rescale and then translate,
multiply v S T, which, generally speaking, is different from v T S.
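The difference between the two orders can be checked symbolically. The sketch below assumes the row-vector convention, a uniform scale factor c, and a translation by (x0, y0, z0); the symbol names are illustrative.

```mathematica
(* row-vector convention: translation in the last row, scaling on the diagonal *)
T = {{1, 0, 0, 0}, {0, 1, 0, 0}, {0, 0, 1, 0}, {x0, y0, z0, 1}};
S = DiagonalMatrix[{c, c, c, 1}];
v = {x, y, z, 1};
v . T . S   (* translate, then rescale: the shift is scaled as well *)
v . S . T   (* rescale, then translate: the shift is left unscaled *)
```

In the first product each coordinate has the form c (x + x0); in the second it has the form c x + x0. The two agree only when c = 1 or the translation is zero.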
Example 2:
■
End of Example 2
Shearing
A typical matrix that performs shearing in two-dimensional space is
The following two animations demonstrate how a shear matrix affects a unit square when 𝑎 increases first, followed by a change in c (left figure). Another animation shows transformations of the unit square when both parameters 𝑎 and b increase.
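One common parametrization, assuming shear parameters named 𝑎 and b as in the paragraph above (the exact form intended by the text may differ), is
\[
\begin{bmatrix} 1 & a \\ b & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x + a\,y \\ b\,x + y \end{bmatrix} .
\]
With b = 0 this is a horizontal shear: the top edge of the unit square slides from (0, 1), (1, 1) to (𝑎, 1), (1 + 𝑎, 1), while the bottom edge stays fixed.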
Example 3:
■
End of Example 3
Rotation
The regular 2D rotation matrix (by angle θ around the origin in the counterclockwise direction) \( \displaystyle \quad \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad \) has the following affine counterpart:
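In the column-vector (bracket) convention, the affine counterpart presumably borders the 2×2 block with a row and a column that keep the homogeneous coordinate equal to 1:
\[
\begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0&0&1 \end{bmatrix} .
\]
For row vectors multiplied from the right, the transpose of this matrix is used.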
The affine matrix above describes a counterclockwise rotation around the origin in the two-dimensional case. This means that multiplying a vector by this matrix (from the left for a column vector, or by its transpose from the right for a row vector) yields a new vector rotated by θ radians. However, if you need to define a rotation around an arbitrary point P₀(x₀, y₀), you need to translate P₀ to (0, 0), then rotate by θ, then translate (0, 0) back to (x₀, y₀):
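In the column-vector convention the three steps compose from right to left. The product below is a worked sketch; the names T(·,·) and R(θ) for the 3×3 affine translation and rotation matrices are introduced here for illustration:
\[
{\bf T}(x_0, y_0)\, {\bf R}(\theta)\, {\bf T}(-x_0, -y_0) = \begin{bmatrix} \cos\theta & -\sin\theta & x_0 - x_0 \cos\theta + y_0 \sin\theta \\ \sin\theta & \cos\theta & y_0 - x_0 \sin\theta - y_0 \cos\theta \\ 0&0&1 \end{bmatrix} .
\]
Applying this matrix to (x₀, y₀, 1)ᵀ returns (x₀, y₀, 1)ᵀ, so P₀ is indeed the fixed point of the rotation.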
In three dimensions, every rotation R(n, θ) turns around an axis, which we identify with a unit vector n along the line of rotation in ℝ³. The axis doesn’t move: it
is a line of eigenvectors with λ = 1. Suppose the axis is in the z-direction. The 1 in the matrix R(n, θ) is to
leave the z-axis alone; the extra 1 in the affine matrix R(n, θ) is to leave the origin alone:
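For n = (0, 0, 1), the 3×3 rotation and its 4×4 affine version can be written as follows (column-vector convention assumed; the symbol Q for the 3×3 block is a label chosen here):
\[
{\bf Q} = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0&0&1 \end{bmatrix} , \qquad {\bf R}({\bf n}, \theta) = \begin{bmatrix} {\bf Q} & {\bf 0} \\ {\bf 0}^{\mathrm T} & 1 \end{bmatrix} .
\]
The 1 in the lower-right corner of Q fixes the z-axis, and the 1 in the lower-right corner of the 4×4 matrix fixes the origin.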
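Reflection
Reflection matrices, also known as mirror matrices, are not elements of SO(n) because their determinant is −1. These orthogonal matrices constitute a subset of O(n), and for a mirror hyperplane with unit normal vector n they can be expressed as
\[
{\bf H} = {\bf I} - \frac{2}{{\bf n}^{\mathrm T} {\bf n}}\, {\bf n}\,{\bf n}^{\mathrm T} = {\bf I} - 2\,{\bf n}\,{\bf n}^{\mathrm T} ,
\]
since \( {\bf n}^{\mathrm T} {\bf n} = [1] \), which is a 1-by-1 matrix with the single entry "1." Recall that \( {\bf n}^{\mathrm T} \in \mathbb{R}^{1 \times n} \) is a row vector and \( {\bf n} \in \mathbb{R}^{n \times 1} \) is a column vector.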
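As a small check of the Householder form H = I − 2 n nᵀ (assumed here as the standard expression for a reflection across the hyperplane with unit normal n), take n = (1, −2)/√5, the unit normal of the line x − 2y = 0 used in the example that follows:
\[
{\bf H} = {\bf I} - \frac{2}{5} \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} = \frac{1}{5} \begin{bmatrix} 3 & 4 \\ 4 & -3 \end{bmatrix} , \qquad {\bf H}\,{\bf n} = -{\bf n} , \qquad \det {\bf H} = -1 .
\]
Vectors along the line, such as (2, 1), are left unchanged by H, as expected of a mirror.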
The following example illustrates the two-dimensional case.
Example 7:
We start with a simple case when the line goes through the origin:
\[
L\ : \quad x - 2\,y = 0 .
\]
Clearly, the vector [1, −2] is orthogonal to L. It is convenient to work with unit vectors, so we normalize it:
\[
\hat{\bf n} = \frac{1}{\sqrt{5}} \left[ 1 \ -2 \right]^{\mathrm T} = \frac{1}{\sqrt{5}} \begin{bmatrix} 1 \\ -2 \end{bmatrix} .
\]
Its orthogonal vector goes along line L,
\[
\hat{\bf n}^{\perp} = \frac{1}{\sqrt{5}} \left[ 2 \ 1 \right]^{\mathrm T} = \frac{1}{\sqrt{5}} \begin{bmatrix} 2 \\ 1 \end{bmatrix} .
\]
Using this unit normal vector n, we build the projection matrix
\[
{\bf P} = {\bf I} - \hat{\bf n}\,\hat{\bf n}^{\mathrm T} = {\bf I} - \frac{1}{\sqrt{5}} \begin{bmatrix} 1 \\ -2 \end{bmatrix} \cdot \frac{1}{\sqrt{5}} \begin{bmatrix} 1 & -2 \end{bmatrix} = \frac{1}{5} \begin{bmatrix} 4&2 \\ 2&1 \end{bmatrix} .
\]
P = IdentityMatrix[2] - {{1, -2}, {-2, 4}}/5
{{4/5, 2/5}, {2/5, 1/5}}
P . P == P
True
Let us take a vector whose projection is easy to verify: v = (0, 1). Line L has slope 1/2, so it makes an angle of arctan(1/2) ≈ 26.57° with the positive abscissa. Therefore,
\[
{\bf v}_{\|} = \left( {\bf v} \bullet \hat{\bf n}^{\perp} \right) \hat{\bf n}^{\perp} = \frac{1}{5} \left[ 2 \ 1 \right]^{\mathrm T} = \frac{1}{5} \begin{bmatrix} 2 \\ 1 \end{bmatrix} .
\]
v = {0, 1}; np = {2, 1}/Sqrt[5];
Dot[v, np]
1/Sqrt[5]
and
\[
{\bf v}_{\perp} = \left( {\bf v} \bullet \hat{\bf n} \right) \hat{\bf n} =
-\frac{2}{\sqrt{5}}\,\hat{\bf n} = \frac{1}{5} \begin{bmatrix} -2 \\ 4 \end{bmatrix} .
\]
We verify with Mathematica by multiplying the projection matrix and the given vector v:
P = IdentityMatrix[2] - {{1, -2}, {-2, 4}}/5;
P . {0, 1}
{2/5, 1/5}
Therefore, the projection of the vector [0, 1] on line L is the vector of length 1/√5:
\[
{\bf v}_{\|} = \frac{1}{5} \begin{bmatrix} 2 & 1 \end{bmatrix}^{\mathrm T} .
\]
■
End of Example 7
In homogeneous coordinates, the projection matrix becomes 4 by 4 (but the origin doesn’t move):
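With a unit normal n written as a column, one way to write this 4-by-4 matrix in block form (a sketch consistent with the matrix P = I − n nᵀ of Example 7) is:
\[
{\bf P} = \begin{bmatrix} {\bf I} - {\bf n}\,{\bf n}^{\mathrm T} & {\bf 0} \\ {\bf 0}^{\mathrm T} & 1 \end{bmatrix} .
\]
The 3×3 block is symmetric, so the same matrix serves the row and column conventions, and [0, 0, 0, 1] P = [0, 0, 0, 1].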
The matrix P gave a “parallel” projection. All points move parallel to n, until they
reach the plane. The other choice in computer graphics is a “perspective” projection. This
is more popular because it includes foreshortening. With perspective, an object looks larger
as it moves closer. Instead of staying parallel to n (and parallel to each other), the lines of
projection come toward the eye—the center of projection. This is how we perceive depth
in a two-dimensional photograph.
Now we want to project a vector onto a plane n • (x − b) = 0, where n is a unit vector perpendicular to the plane, x = (x₁, x₂, x₃) ∈ ℝ³, and b = (b₁, b₂, b₃) ∈ ℝ³.
This plane does not go through the origin, but through the point b ≠ 0.
The projection onto the flat (n • (x − b) = 0, a plane going through a point other than the origin) has three steps. Translate b to the origin by matrix T−. Project
along the n direction, and translate back along the row vector b. A projection matrix is symmetric, but the translation matrices T− and T+ depend on whether vectors are written as rows or as columns. Since computer graphics works with rows, we demonstrate this approach first:
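In block form, with b treated as a row and n as a column, the three steps for row vectors can be sketched as:
\[
{\bf T}_{-}\, {\bf P}\, {\bf T}_{+} = \begin{pmatrix} {\bf I} & {\bf 0} \\ -{\bf b} & 1 \end{pmatrix} \begin{pmatrix} {\bf I} - {\bf n}\,{\bf n}^{\mathrm T} & {\bf 0} \\ {\bf 0} & 1 \end{pmatrix} \begin{pmatrix} {\bf I} & {\bf 0} \\ {\bf b} & 1 \end{pmatrix} ,
\]
so that the row [x, 1] is sent to [(x − b)(I − n nᵀ) + b, 1]. For column vectors, each factor is transposed and the order of the product is reversed.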
The second process in 3D graphics is called animation. This process defines relationships between 3D objects in a three-dimensional space over time. This can be done through many different methods such as key frames, inverse kinematics, and motion capture. Motion capture is the modeling of a 3D animation by using sensors or cameras to capture the motion of an object or a person in the real world. It is used in movies to capture the facial expressions of actors so that they can be reproduced on animated characters. For example, in the Pirates of the Caribbean movies, the facial expressions of Davy Jones, including his face tentacles, were modeled with the help of motion capture of the actor’s face. Inverse kinematics is a powerful tool when developing games or movies: it makes it possible to calculate the precise positions for a joint system so that it will eventually reach a certain goal, determining the path of an object from partial information or data from some other source. A known application of this process is in robotics, where it is employed to calculate the trajectory needed for a robot’s limb in order to successfully perform a maneuver or a task. This process therefore requires a sophisticated application of Linear Algebra.
Consider a picture of Mario jumping with a velocity of (1, 3): he is moving quickly upwards and to the right, with an acceleration of (0, −1). In the game, the player normally uses an analog stick to control the left and right movement of the character and presses a button to make the character jump. This is a good example of how games use vector addition and subtraction to calculate the overall velocity and position of the player.
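As a worked illustration of such an update, assume a unit time step and a simple explicit (Euler) update, with a starting position p₀ = (0, 0) chosen purely for the example; v₀ = (1, 3) and a = (0, −1) are the values quoted above:
\[
{\bf v}_1 = {\bf v}_0 + {\bf a} = (1, 3) + (0, -1) = (1, 2) , \qquad {\bf p}_1 = {\bf p}_0 + {\bf v}_0 = (1, 3) , \qquad {\bf p}_2 = {\bf p}_1 + {\bf v}_1 = (2, 5) .
\]
The vertical velocity drops by 1 at each step, so it reaches zero after three steps, which marks the top of the jump.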
The third process in 3D graphics is called 3D rendering. 3D rendering makes use of a 3D wireframe model of an object or multiple objects to produce an animated scene or a 2D image of the scene. This is done through the application of two operations. The first operation is transport, which describes how much light is shone onto the surface being rendered, from what direction this light is coming, and how intense the light source is. The second operation is called scattering, which determines how the surface being rendered interacts with light. These two operations are required to render 3D models in CAD animation, or even physics and weather animations. In entertainment settings, such as some video games, movies, and advertisements, other sophisticated techniques such as god rays or scanline rendering are used to improve the quality.
Rigid Body Transformations
A rigid body transformation is one that changes the location and orientation of an object, but not its shape. All angles, lengths, areas, and volumes
are preserved. Translation and rotation are the only rigid body transformations. Reflection is not considered a rigid body transformation.
All rigid body transformations are angle-preserving, invertible, and affine, and their linear (rotation) part is orthogonal. Rigid body transforms are the most restrictive class of transforms, but they are also extremely common in practice.
The determinant of any rigid body transformation matrix is 1.
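The determinant claim can be seen from the block form of a rigid body transformation. The sketch below assumes the column-vector convention, a rotation block R with det R = 1, and a translation vector t:
\[
\det \begin{bmatrix} {\bf R} & {\bf t} \\ {\bf 0}^{\mathrm T} & 1 \end{bmatrix} = \det {\bf R} \cdot 1 = 1 .
\]
A reflection, by contrast, has determinant −1, which is consistent with its exclusion from the rigid body transformations.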
■
In conclusion, Linear Algebra is used in many different ways in computer graphics. The
mathematical structure of computer graphics takes advantage of many operations and theorems
in Linear Algebra to assist in 2D/3D models, in animations and in rendering.
There are many uses for 3D wireframes, such as viewing the model from any angle or analyzing the distances between edges and corners. This technique has been in use for almost as long as computer displays have been around. It gained attention in the late 1970s and early 1980s when it was used in computer games. Around the same time, 3D object rendering started being used in movies to depict objects. The making of 3D graphics can be divided into the following processes:
First is the modeling of the surface of an object as a collection of points in 3D space. This process is done with the help of modeling software or by using specialized 3D scanners. In either case, the end result is a collection of points in 3D space. These points are vectors in Linear Algebra, to which operations such as translation, rotation, and scaling can be applied.
2D Affine Transformations
Matrix representations of 2D affine transformations include an extra dimension. Matrices are written in brackets when they are considered as operators acting on column vectors from the left. Matrices are enclosed in parentheses when they multiply row vectors from the right.
Translation:
\[
\mathbf{T} = \begin{bmatrix} 1&0&\Delta x \\
0&1&\Delta y \\
0&0&1 \end{bmatrix} = \begin{pmatrix} 1&0&0 \\
0&1&0 \\
\Delta x & \Delta y &1 \end{pmatrix}^{\mathrm T} .
\]
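Applied to a point in homogeneous coordinates, the two forms give the same shifted point; the following lines simply carry out the multiplication:
\[
\begin{bmatrix} 1&0&\Delta x \\ 0&1&\Delta y \\ 0&0&1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + \Delta x \\ y + \Delta y \\ 1 \end{bmatrix} , \qquad \left[ x \ \ y \ \ 1 \right] \begin{pmatrix} 1&0&0 \\ 0&1&0 \\ \Delta x & \Delta y &1 \end{pmatrix} = \left[ x + \Delta x \ \ \ y + \Delta y \ \ \ 1 \right] .
\]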
Shearing is a transformation that skews the coordinate space; it is
usually achieved by adding a multiple of one coordinate to the other. This is obtained by matrix multiplication (from the left or from the right):
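For instance, a shear along the x-axis with a factor k (a symbol assumed here for illustration) can be written in the column-vector convention as:
\[
\begin{bmatrix} 1 & k & 0 \\ 0&1&0 \\ 0&0&1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + k\,y \\ y \\ 1 \end{bmatrix} .
\]
For row vectors the transposed matrix is applied from the right. A shear along the y-axis places the factor in the (2, 1) entry instead.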
Example 18:
http://igg.unistra.fr/People/seo/2%20affine%20transformation.pdf
Let us consider the nonempty subset L of 𝔸² consisting of all
points (x, y) satisfying the equation
\[
2\,x + y -3 = 0 .
\]
The set L is the line of slope −2 passing through the points (3/2, 0) and (0, 3)
shown in Figure below.
The line L can be made into an official affine space by defining the action d : L × ℝ → L by
\[
\left( x, 3 - 2\, x \right) + v = \left( x+ v, 3 - 2\,x - 2\,v \right) .
\]
This action turns L into an affine space. For example, for any two points P(𝑎, 3 − 2𝑎) and Q(b, 3 − 2b) on L, there is a unique (vector) u ∈ ℝ such that Q = P + u. It is clear that u = b − 𝑎. Note that the vector space ℝ is isomorphic to the line of equation 2x + y = 0 passing through the origin.
Similarly, consider the nonempty subset H of 𝔸³ consisting of all
points (x, y, z) satisfying the equation
\[
2\, x + 3\,y + z - 6 = 0 .
\]
The set H is the plane passing through the points (3, 0, 0), (0, 2, 0), and (0, 0, 6). The plane H can be made into an official affine space by
defining the action d : H × ℝ² → H by
\[
\left( x, y, 6 - 2\,x - 3\,y \right) + \begin{bmatrix} u \\ v \end{bmatrix} = \left( x + u , y + v , 6 - 2\,x -2\,u - 3\,y -3\,v \right)
\]
for any point (x, y, 6 −2x −3y) of H and any vector [u, v]T ∈ ℝ².
The affine matrix for corresponding transformation is
????
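One candidate, offered only as a sketch: the action moves every point of H by the vector (u, v, −2u − 3v), which is a translation of 𝔸³, so in homogeneous coordinates (column-vector convention assumed) its matrix would be
\[
\begin{bmatrix} 1&0&0&u \\ 0&1&0&v \\ 0&0&1& -2\,u - 3\,v \\ 0&0&0&1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 6 - 2\,x - 3\,y \\ 1 \end{bmatrix} = \begin{bmatrix} x + u \\ y + v \\ 6 - 2\,x - 2\,u - 3\,y - 3\,v \\ 1 \end{bmatrix} ,
\]
which agrees with the formula for the action given above.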
■
End of Example 18
Example 19:
We reconsider an affine transformation from Example 2:
\[
\begin{bmatrix} x \\ y \end{bmatrix} \,\mapsto \, \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} \phantom{-}2 \\ -1 \end{bmatrix} ,
\]
with matrix and free term
\[
{\bf A} = \begin{bmatrix} 1 & 1 \\ 0 & 2 \end{bmatrix} \qquad \mbox{and} \qquad {\bf b} = \begin{bmatrix} \phantom{-}2 \\ -1 \end{bmatrix} ,
\]
respectively. We demonstrate Mathematica's capabilities to handle an affine transformation.
A = {{1, 1}, {0, 2}};
b = {2, -1};
t = AffineTransform[{A, b}]
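As a sketch of how the resulting TransformationFunction can be inspected (A, b, and t as defined above; the symbols x and y are left unassigned on purpose), one may apply t to a generic point and extract its augmented matrix with the built-in TransformationMatrix:

```mathematica
A = {{1, 1}, {0, 2}};
b = {2, -1};
t = AffineTransform[{A, b}];
t[{x, y}]                 (* image of a generic point, A . {x, y} + b *)
TransformationMatrix[t]   (* 3x3 homogeneous matrix of the map *)
```

The matrix returned by TransformationMatrix should carry A in its upper-left block, b in its last column, and the row 0 0 1 at the bottom.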
Earlier, way back at the top of this page, we began a passage with "One might reasonably ask: 'What does the last line mean? What does the row containing {0.,0.|1.}, of the TransformationFunction do?'" We put off until now a more complete answer to that question. Before, what followed was a particular answer from the ChatBot representing Wolfram's Artificial Intelligence module in a Chat-Enabled notebook. Below we ask that same question of the same AI bot and this time we get:
The last row "0 0 1" of a 3x3 transformation matrix is a part of the homogeneous coordinate representation used in affine transformations.
In homogeneous coordinates, a 2D point (x, y) is represented as (x, y, 1) and a 2D vector is represented as (x, y, 0). The reason for this is to allow for translations to be represented as matrix multiplications.
The "0 0 1" row in the transformation matrix ensures that when this matrix is multiplied with a point represented in homogeneous coordinates, the 1 in the third component of the point stays a 1. This is necessary to keep the point a point, and not a vector.
In contrast, if the transformation matrix is multiplied with a vector, the 0 in the third component of the vector stays a 0, ensuring that the vector remains a vector and is not translated.
So, the "0 0 1" row is essentially a part of the mathematical machinery that allows for points and vectors to be treated differently by affine transformations, particularly translations. It doesn't directly affect the scaling, rotation, shear or translation applied to the points or vectors.
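To see the difference concretely, take the matrix A and the vector b of Example 19 and apply the 3×3 matrix once to the point (1, 1) and once to the vector (1, 1). The numbers below are a worked illustration in the column-vector convention:
\[
\begin{bmatrix} 1&1&2 \\ 0&2&-1 \\ 0&0&1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 1 \\ 1 \end{bmatrix} , \qquad \begin{bmatrix} 1&1&2 \\ 0&2&-1 \\ 0&0&1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \\ 0 \end{bmatrix} .
\]
The point is multiplied by A and then shifted by b; the vector is only multiplied by A.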
Now we are prepared to use a special Mathematica command,
FindGeometricTransform, which is suitable for image processing.
Below we have two images of a man recorded at different points in time. We want to know how many errors he has made in the time that elapsed between these two images.
image1 = (image not shown)
image2 = (image not shown)
We are not surprised that he has made very few errors.
If we want to find out how tilted he has become, we apply the geometric transformation that Mathematica found to the first image and see that he has become a little off level over the years.
Example 20:
http://igg.unistra.fr/People/seo/2%20affine%20transformation.pdf
■
End of Example 20
Example 21:
■
End of Example 21
Rotation: In its most general form, rotation is defined to take place about some fixed point.
We will consider the simplest case where the fixed point is the origin of the coordinate frame.
Example 22:
■
End of Example 22
3D Affine Transformations
Now, we can extend all of the previously discussed ideas to 3D in the following way. First, we convert every 3D point P(x, y, z) to homogeneous coordinates, written in either row form or column form:
\[
\left[ x\, : \, y \, : \, z \, : \, 1\right] \in \mathbb{R}^{1 \times 4}
\qquad \mbox{or} \qquad \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} \in \mathbb{R}^{4 \times 1} .
\]
The following matrices constitute the basic affine transforms in 3D, expressed in homogeneous form:
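For reference, translation by (Δx, Δy, Δz) and scaling by factors (s_x, s_y, s_z) can be written as follows in the column-vector (bracket) convention; the symbol names are chosen here by analogy with the 2D case:
\[
{\bf T} = \begin{bmatrix} 1&0&0&\Delta x \\ 0&1&0&\Delta y \\ 0&0&1&\Delta z \\ 0&0&0&1 \end{bmatrix} , \qquad {\bf S} = \begin{bmatrix} s_x &0&0&0 \\ 0& s_y &0&0 \\ 0&0& s_z &0 \\ 0&0&0&1 \end{bmatrix} .
\]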
Reflection in 3-space is specified by a plane and flips
points in space about this plane. In this case, reflection is just a special case of scaling,
but where the scale factor is negative. A common simple version of this is when the plane
about which the reflection is performed is one of the coordinate planes (corresponding
to x = 0, y = 0, or z = 0).
For example, to reflect points about the xz-coordinate plane (that is, the plane y = 0),
we can scale the y-coordinate by −1. Using the scaling matrix above, we
have the following transformation matrix:
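Taking the scale factors 1, −1, 1 on the diagonal gives the following matrix (a direct reconstruction from the description above, shown acting on a column vector):
\[
\begin{bmatrix} 1&0&0&0 \\ 0&-1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix} x \\ -y \\ z \\ 1 \end{bmatrix} .
\]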
The cases for the other two coordinate planes are similar.
Example 24:
■
End of Example 24
Rotation: In its most general form, rotation is defined to take place about some fixed vector in space ℝ³.
We will consider the simplest case where the fixed vector is one of the coordinate axes. There are three basic rotations: about the x-, y-, and z-axes. In each case, the rotation
is counterclockwise through an angle θ (given in radians). The rotation is assumed to
be in accordance with the right-hand rule: if your right thumb is aligned with the axis of
rotation, then positive rotation is indicated by the direction in which the fingers of this
hand are pointing. To produce a clockwise rotation, simply negate the angle involved.
Consider a rotation about the z-axis. The z-unit vector and origin are unchanged.
The x-unit vector is mapped to (cos θ, sin θ, 0, 0), and the y-unit vector is mapped to (−sin θ, cos θ, 0, 0). Thus the rotation matrix is:
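Placing these images as columns (column-vector convention assumed) gives the matrix for the z-axis; the x- and y-axis matrices are listed alongside as the standard right-handed counterparts:
\[
{\bf R}_z = \begin{bmatrix} \cos\theta & -\sin\theta & 0&0 \\ \sin\theta & \cos\theta &0&0 \\ 0&0&1&0 \\ 0&0&0&1 \end{bmatrix} , \quad
{\bf R}_x = \begin{bmatrix} 1&0&0&0 \\ 0& \cos\theta & -\sin\theta &0 \\ 0& \sin\theta & \cos\theta &0 \\ 0&0&0&1 \end{bmatrix} , \quad
{\bf R}_y = \begin{bmatrix} \cos\theta &0& \sin\theta &0 \\ 0&1&0&0 \\ -\sin\theta &0& \cos\theta &0 \\ 0&0&0&1 \end{bmatrix} .
\]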
A rotation by angle θ about an arbitrary axis can be decomposed into the concatenation of rotations about the x, y, and z axes (see section on Euler's theorem).
Example 27:
Let us consider a line that goes through the point P(7, 11, −5) and is parallel to the vector w = (3, 1, 8). We want to rotate the point Q(6, −9, 15) about this line through angles θ that are multiples of 5° until we are back at the point Q. Each rotation through a fixed angle θ is one application of an affine map. The plot of these points is essentially the circle of rotation for Q. The formula for the rotated point Qnew is given by
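One way to compute these points (a sketch, not necessarily the formula this example originally displayed) is to translate P to the origin, rotate about w, and translate back: Qnew = P + R(θ, w)(Q − P). The snippet uses the built-in RotationMatrix[θ, w] for the rotation about the direction w; the names p, w, q, rot, and pts are chosen here for illustration.

```mathematica
p = {7, 11, -5}; w = {3, 1, 8}; q = {6, -9, 15};
(* rotate q about the line through p with direction w by the angle theta *)
rot[theta_] := p + RotationMatrix[theta, w] . (q - p);
(* angles 0, 5, ..., 360 degrees: 73 points, the last one closing the circle *)
pts = N[Table[rot[5 k Degree], {k, 0, 72}]];
Chop[Last[pts] - q]   (* should vanish up to rounding *)
Graphics3D[{Line[pts], Red, PointSize[Large], Point[q]}]
```

Every point in pts keeps the distance of Q from P, so the polyline traced by Line[pts] approximates the circle of rotation.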
■
End of Example 27