We are often interested not just in applying
linear functions to vectors, but also in combining vectors from different vector
spaces together in a linear way.
Suppose V and U are vector spaces over the same field 𝔽. Then a function f : V × U ⇾ 𝔽 is called a bilinear form if it satisfies the following properties:
It is linear in its first argument:
f(v₁ + v₂, u) = f(v₁, u) + f(v₂, u) for all v₁, v₂ ∈ V and ∀ u ∈ U;
f(kv, u) = kf(v, u) for all scalars k and ∀ v ∈ V, ∀ u ∈ U.
It is linear in its second argument:
f(v, u₁ + u₂) = f(v, u₁) + f(v, u₂) for all
v ∈ V and ∀ u₁, u₂ ∈ U;
f(v, ku) = kf(v, u) for all scalars k and ∀ v ∈ V, ∀ u ∈ U.
While the above definition might seem like a mouthful, it simply says that
f is a bilinear form exactly when it becomes a linear form (or functional) whenever one of its inputs is held constant. That is, for every fixed vector u ∈ U the function gᵤ : V → 𝔽 defined by gᵤ(v) = f(v, u) is a linear form, and similarly for every fixed vector
v ∈ V the function hᵥ : U → 𝔽 defined by hᵥ(u) = f(v, u) is a linear form.
Example 1:
Let us consider polar coordinates on the plane ℝ²:
\[
x = r\,\cos\theta , \quad y = r\,\sin\theta ,\qquad -\pi < \theta \le \pi ,
\]
or in complex plane ℂ:
\[
z = r\,e^{{\bf j}\,\theta} ,
\]
where j is the imaginary unit of the complex plane, so that j² = −1. Every point (x, y) of the plane ℝ² is thus identified by the pair of real numbers (r, θ). However, the transformation from Cartesian coordinates (x, y) to polar coordinates (r, θ) is not linear because
\[
\left( r_1 , \theta_1 \right) + \left( r_2 , \theta_2 \right) \ne \left( r_1 + r_2 , \theta_1 + \theta_2 \right) .
\]
Nor is the transition to polar coordinates bilinear, because
\[
\left( r_1 , \theta \right) + \left( r_2 , \theta \right) \ne \left( r_1 + r_2 , \theta \right) \quad\mbox{and} \quad \left( r , \theta_1 \right) + \left( r , \theta_2 \right) \ne \left( r , \theta_1 + \theta_2 \right) .
\]
■
End of Example 1
The dot product is the prototypical example of a bilinear form. Perhaps even more simply, notice that multiplication (of
real numbers) is a bilinear form. That is, if we define a function f : ℝ × ℝ → ℝ
simply via f(x, y) = xy, then f is a bilinear form. This of course makes sense
since multiplication of real numbers is just the one-dimensional dot product.
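To make this concrete, here is a minimal numerical sketch (in Python, which is not part of the text; the sample numbers are arbitrary) checking both linearity conditions for f(x, y) = xy:

```python
# Minimal sketch: multiplication f(x, y) = x*y satisfies both linearity
# conditions from the definition of a bilinear form (checked at sample points).
f = lambda x, y: x * y

x1, x2, y, k = 2.0, -3.0, 5.0, 7.0

# Linearity in the first argument.
assert f(x1 + x2, y) == f(x1, y) + f(x2, y)
assert f(k * x1, y) == k * f(x1, y)

# Linearity in the second argument.
assert f(x1, y + k) == f(x1, y) + f(x1, k)
assert f(x1, k * y) == k * f(x1, y)
```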
One of the most significant applications of bilinear forms is the relationship they establish between a vector space and its dual space; the generalization of this relationship to infinite-dimensional vector spaces is known as the Riesz representation theorem.
Let V be a finite-dimensional vector space over a field 𝔽, and let
V ′ (also denoted V✶) be its dual space. Then the function g : V ′ × V → 𝔽 defined by
\[
g(\phi , \mathbf{v}) = \phi (\mathbf{v}) \qquad \mbox{for all} \quad \phi \in V' , \ \mathbf{v} \in V ,
\]
is a bilinear form.
Indeed, from the definition of addition and scalar multiplication of functions, we observe that g is linear in each of its input arguments individually. For the first argument,
\[
g(\phi + \psi , \mathbf{v}) = (\phi + \psi )(\mathbf{v}) = \phi (\mathbf{v}) + \psi (\mathbf{v}) = g(\phi , \mathbf{v}) + g(\psi , \mathbf{v}) , \qquad g(k\,\phi , \mathbf{v}) = k\,\phi (\mathbf{v}) = k\, g(\phi , \mathbf{v})
\]
for all ϕ, ψ ∈ V ′, all v ∈ V, and any scalar k. Linearity in the second argument,
\[
g(\phi , \mathbf{v} + \mathbf{u}) = \phi (\mathbf{v} + \mathbf{u}) = \phi (\mathbf{v}) + \phi (\mathbf{u}) = g(\phi , \mathbf{v}) + g(\phi , \mathbf{u}) , \qquad g(\phi , k\,\mathbf{v}) = \phi (k\,\mathbf{v}) = k\, g(\phi , \mathbf{v}) ,
\]
holds because each ϕ ∈ V ′ is a linear functional.
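As a quick illustration, the following minimal Python sketch (the functionals ϕ, ψ and the sample vectors are arbitrary choices, not from the text) checks linearity of the pairing in the functional slot via pointwise addition of functions, and in the vector slot via linearity of each functional:

```python
# Minimal sketch of the pairing g(phi, v) = phi(v) for V = R^3, with linear
# functionals represented as Python callables.
import numpy as np

phi = lambda v: 2 * v[0] - v[1]          # one linear functional on R^3
psi = lambda v: v[1] + 4 * v[2]          # another linear functional on R^3
g = lambda functional, v: functional(v)  # the pairing g(phi, v) = phi(v)

v = np.array([1.0, -2.0, 3.0])
u = np.array([0.5, 1.0, -1.0])
k = 3.0

# Linearity in the first slot uses pointwise addition of functions.
phi_plus_psi = lambda w: phi(w) + psi(w)
assert np.isclose(g(phi_plus_psi, v), g(phi, v) + g(psi, v))

# Linearity in the second slot uses linearity of each functional.
assert np.isclose(g(phi, v + u), g(phi, v) + g(phi, u))
assert np.isclose(g(phi, k * v), k * g(phi, v))
```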
Example 2:
Let z₁ = 1 − 2j and z₂ = 3 + j be two complex numbers, where j is the imaginary unit, so that j² = −1. Consider the mapping f : ℂ² × ℂ² → ℂ defined by
\[
f\left( \mathbf{v}, \mathbf{u} \right) = z_1 v_1 u_2 + z_2 v_2 u_1 ,
\tag{1.1}
\]
where v = (v₁, v₂) and u = (u₁, u₂) are vectors in ℂ² with complex components.
This function (1.1) is a bilinear form because it is linear in each argument.
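A quick numerical check of (1.1) in Python (a minimal sketch; the sample vectors and scalar are arbitrary) confirms linearity in the first argument, and linearity in the second argument is checked the same way:

```python
# Quick numerical check of formula (1.1) with z1 = 1 - 2j and z2 = 3 + j,
# using Python's built-in complex type (Python also writes the imaginary unit as j).
z1, z2 = 1 - 2j, 3 + 1j

def f(v, u):
    return z1 * v[0] * u[1] + z2 * v[1] * u[0]

v  = (2 + 1j, -1 + 3j)
v2 = (1 - 1j, 4 + 0j)
u  = (3 + 2j, -2 - 1j)
k  = 2 - 5j

# Additivity and homogeneity in the first argument.
vsum = (v[0] + v2[0], v[1] + v2[1])
assert abs(f(vsum, u) - (f(v, u) + f(v2, u))) < 1e-12
kv = (k * v[0], k * v[1])
assert abs(f(kv, u) - k * f(v, u)) < 1e-12
```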
■
End of Example 2
We are going to use the dot product to define bilinear forms in another way. Let A be an m-by-n matrix over the field 𝔽 (which could be either ℝ or ℂ). We define the bilinear form f : 𝔽^m × 𝔽^n → 𝔽 by
\[
f(\mathbf{v}, \mathbf{u}) = \mathbf{v} \bullet \left( {\bf A}\,\mathbf{u} \right) \qquad \mbox{for all} \quad \mathbf{v} \in\mathbb{F}^{m} , \ \mathbf{u} \in\mathbb{F}^{n} .
\]
Once again, we just check the defining properties of a bilinear form, all of which follow straightforwardly from the corresponding properties
of matrix multiplication:
For all v, v₁, v₂ ∈ 𝔽^m, all u, u₁, u₂ ∈ 𝔽^n, and an arbitrary scalar k ∈ 𝔽, we have
\[
f(\mathbf{v}_1 + \mathbf{v}_2 , \mathbf{u}) = \left( \mathbf{v}_1 + \mathbf{v}_2 \right) \bullet \left( {\bf A}\,\mathbf{u} \right) = \mathbf{v}_1 \bullet \left( {\bf A}\,\mathbf{u} \right) + \mathbf{v}_2 \bullet \left( {\bf A}\,\mathbf{u} \right) = f(\mathbf{v}_1 , \mathbf{u}) + f(\mathbf{v}_2 , \mathbf{u})
\]
and
\[
f(k\,\mathbf{v}, \mathbf{u}) = \left( k\,\mathbf{v}\right) \bullet \left( {\bf A}\,\mathbf{u} \right) = k \left( \mathbf{v} \bullet {\bf A}\,\mathbf{u} \right) = k\, f(\mathbf{v}, \mathbf{u}) .
\]
Linearity in the second argument follows in the same way from A(u₁ + u₂) = Au₁ + Au₂ and A(ku) = kAu.
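The same computation can be checked numerically; the following minimal Python/NumPy sketch uses an arbitrary 2 × 3 matrix A and random sample vectors:

```python
# Sketch: for a fixed matrix A, the map f(v, u) = v . (A u) is bilinear.
# The sizes m = 2, n = 3 and the sample vectors are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
f = lambda v, u: v @ (A @ u)

v1, v2 = rng.standard_normal(2), rng.standard_normal(2)
u1, u2 = rng.standard_normal(3), rng.standard_normal(3)
k = 1.7

assert np.isclose(f(v1 + v2, u1), f(v1, u1) + f(v2, u1))   # additive in v
assert np.isclose(f(k * v1, u1), k * f(v1, u1))            # homogeneous in v
assert np.isclose(f(v1, u1 + u2), f(v1, u1) + f(v1, u2))   # additive in u
assert np.isclose(f(v1, k * u1), k * f(v1, u1))            # homogeneous in u
```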
In fact, we now show that every bilinear form acting on finite-dimensional
vector spaces can be written in this way. Just like matrices can be used to
represent linear transformations, they can also be used to represent bilinear
forms.
Theorem 1:
Let α and β be ordered bases of m- and n-dimensional vector spaces V and U, respectively, over a field 𝔽, and let f : V × U → 𝔽 be a bilinear form. Then there exists a unique m × n matrix A over 𝔽 such that
\[
f(\mathbf{v}, \mathbf{u}) = \left[ \mathbf{v} \right]_{\alpha} \bullet {\bf A}\, \left[ \mathbf{u} \right]_{\beta} \qquad \forall \mathbf{v} \in V, \ \forall\mathbf{u} \in U,
\]
where [v]α and [u]β denote the coordinate vectors of v and u with respect to the bases α and β, respectively.
We just use the fact that bilinear forms are linear when one of their inputs is held constant, and we then leech off of the representation of linear forms that we already know: every linear form on a finite-dimensional space can be written as a dot product with a fixed coordinate vector.
Specifically, if we denote the vectors in the basis α by α = { v₁, v₂, … , vₘ },
then [vⱼ]α = eⱼ for all 1 ≤ j ≤ m, and the representation of linear forms tells us that the linear form
gⱼ : U → 𝔽 defined by gⱼ(u) = f(vⱼ, u) can be written as gⱼ(u) = aⱼ • [u]β
for some fixed (column) vector aⱼ ∈ 𝔽^{n×1}. If we let A be the matrix whose rows are a₁ᵀ, a₂ᵀ, … , aₘᵀ (i.e., Aᵀ = [a₁ | a₂ | ⋯ | aₘ]), then
\begin{align*}
f(\mathbf{v}_j , \mathbf{u}) &= g_j (\mathbf{u}) = {\bf a}_j \bullet \left[ \mathbf{u} \right]_{\beta}
\\
&= \mathbf{e}_j \bullet {\bf A} \left[ \mathbf{u} \right]_{\beta} = \left[ \mathbf{v}_j \right]_{\alpha} \bullet {\bf A} \left[ \mathbf{u} \right]_{\beta}
\\
& \qquad \mbox{for all} \quad 1 \le j \le m, \quad \mathbf{u} \in U.
\end{align*}
To see that this same equation holds when we replace vj by an arbitrary v ∈ V, we just use linearity in the first argument of f and the fact that every v ∈ V can be written as a linear combination of the basis vectors from α (i.e., v = c₁v₁ + c₂v₂ + ⋯ + cmvm for some scalars c₁, c₂, …, cm ∈ 𝔽):
\begin{align*}
f(\mathbf{v}, \mathbf{u}) &= f \left( \sum_{1\le j \le m} c_j \mathbf{v}_j , \mathbf{u} \right)
\\
&= \sum_{1\le j \le m} c_j f\left( \mathbf{v}_j , \mathbf{u} \right) = \sum_{1\le j \le m} c_j \left[ \mathbf{v}_j \right]_{\alpha} \bullet {\bf A} \left[ \mathbf{u} \right]_{\beta}
\\
&= \left( \sum_{1\le j \le m} c_j \mathbf{e}_j \right) \bullet {\bf A} \left[ \mathbf{u} \right]_{\beta} = \left[ \mathbf{v} \right]_{\alpha} \bullet {\bf A} \left[ \mathbf{u} \right]_{\beta}
\end{align*}
for all v ∈ V and u ∈ U.
Finally, to see that A is unique, we just note that if β = { u₁, u₂, … , uₙ },
then f(vᵢ, uⱼ) = [vᵢ]α • A[uⱼ]β = eᵢ • (Aeⱼ) = aᵢ,ⱼ for all 1 ≤ i ≤ m and 1 ≤ j ≤ n, so the entries of the matrix A are completely determined by f.
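In practice this gives a recipe for finding the representing matrix: when V = ℝ^m and U = ℝ^n with the standard bases, the entries are simply the values f(eᵢ, eⱼ). The following minimal Python/NumPy sketch (the particular bilinear form is an arbitrary illustration) implements this recipe and verifies the representation:

```python
# Sketch of Theorem 1 in coordinates: for a bilinear form on R^m x R^n with
# the standard bases, the representing matrix has entries A[i, j] = f(e_i, e_j).
import numpy as np

def representing_matrix(f, m, n):
    A = np.zeros((m, n))
    for i in range(m):
        for j in range(n):
            e_i = np.eye(m)[i]          # i-th standard basis vector of R^m
            e_j = np.eye(n)[j]          # j-th standard basis vector of R^n
            A[i, j] = f(e_i, e_j)
    return A

# An arbitrary example of a bilinear form on R^2 x R^2.
f = lambda v, u: 3 * v[0] * u[0] - v[0] * u[1] + 2 * v[1] * u[1]
A = representing_matrix(f, 2, 2)

v, u = np.array([1.0, 4.0]), np.array([-2.0, 3.0])
assert np.isclose(f(v, u), v @ A @ u)   # f(v, u) = v . (A u)
```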
As one particularly interesting example of how the above theorem works,
we recall that the determinant is multilinear in the columns of the square matrix it acts on, and thus in particular it can be interpreted as a bilinear form on the columns of a 2 × 2 matrix. More specifically, if we define a function f : 𝔽^{2×1} × 𝔽^{2×1} → 𝔽 via
\[
f(\mathbf{v}, \mathbf{u}) = \det \left[ \mathbf{v} \mid \mathbf{u} \right] ,
\]
then f is a bilinear form and thus can be represented by a 2 × 2 matrix.
Example 4:
Find the matrix A ∈ 𝔽^{2×2} with the property that
\[
\det \left[ \mathbf{v} \mid \mathbf{u} \right] = \mathbf{v} \bullet {\bf A}\,\mathbf{u} \quad\mbox{for all} \quad \mathbf{v}, \mathbf{u} \in\mathbb{F}^{2\times 1} .
\]
Solution: Recall that det[ v ∣ u ] = v₁u₂ − v₂u₁, while a direct calculation shows that
\[
\mathbf{v} \bullet {\bf A}\,\mathbf{u} = a_{1,1} v_1 u_1 + a_{1,2} v_1 u_2 + a_{2,1} v_2 u_1 + a_{2,2} v_2 u_2 .
\]
By simply comparing these two expressions, we see that the unique matrix
A that makes them equal to each other has entries a₁,₁ = 0, a₁,₂ = 1, a₂,₁ = −1, and a₂,₂ = 0. That is,
\[
\det \left[ \mathbf{v} \mid \mathbf{u} \right] = \mathbf{v} \bullet {\bf A}\,\mathbf{u} \quad\mbox{if and only if} \quad {\bf A} = \begin{bmatrix} \phantom{-}0&1 \\ -1&0 \end{bmatrix} .
\]
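A quick numerical verification of this identity (a minimal Python/NumPy sketch with arbitrary sample vectors):

```python
# Numerical check of Example 4: det[v | u] = v . (A u) for the matrix found above.
import numpy as np

A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])
v = np.array([2.0, -1.0])
u = np.array([3.0, 5.0])

det = np.linalg.det(np.column_stack([v, u]))   # determinant of the 2x2 matrix [v | u]
assert np.isclose(det, v @ (A @ u))
```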
■
End of Example 4
It is natural to extend the notion of a bilinear form to mappings into an arbitrary vector space, since this extension is central to understanding duality in linear algebra.
If U, V, and W are vector spaces over the same field of scalars, then a mapping
\[
f\,:\ U\times V \mapsto W
\]
is called bilinear if it is linear in each variable when the other variable is held fixed.
This means that if we hold u fixed, then v ↦ f(u, v) is linear, so it belongs to the space of all linear transformations from V into W; this vector space is usually denoted ℒ(V, W) or Hom(V, W). Similarly, if we hold the second argument v fixed, then u ↦ f(u, v) belongs to ℒ(U, W) = Hom(U, W).
This is not the same notion as linearity on the product vector
space U × V. For example, (x, y) ↦ x + y is a linear mapping from ℝ²
to ℝ, but it is not bilinear: if y is held fixed, then the mapping x ↦ x + y is affine (translation by y), and it is not linear unless y = 0. On the other hand, (x, y) ↦ xy is a bilinear mapping from ℝ × ℝ to ℝ, but it is not linear. If y is held fixed, then the mapping x ↦ xy is linear; however, the sum of two ordered couples does not map to the sum of their images:
\[
\left( x, y \right) + \left( u, v \right) = \left( x + u, y + v \right) \, \mapsto \ \left( x + u \right) \left( y+v \right) ,
\]
which is not the sum of the images xy + uv.
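The contrast between the two maps can be checked directly; the following minimal Python sketch (the sample numbers are arbitrary) shows that x + y is linear but not bilinear, while xy is bilinear but not linear:

```python
# Sketch contrasting linearity on R^2 with bilinearity, using the two maps
# from the text: s(x, y) = x + y and p(x, y) = x*y.
s = lambda x, y: x + y
p = lambda x, y: x * y

x, y, u, v = 2.0, 3.0, 5.0, 7.0

# s is linear on the product space: s((x, y) + (u, v)) = s(x, y) + s(u, v).
assert s(x + u, y + v) == s(x, y) + s(u, v)
# ...but not bilinear: with y fixed, x -> x + y is affine, not linear.
assert s(2 * x, y) != 2 * s(x, y)

# p is bilinear: linear in x when y is held fixed.
assert p(x + u, y) == p(x, y) + p(u, y)
# ...but not linear on the product space: (x + u)(y + v) != xy + uv in general.
assert p(x + u, y + v) != p(x, y) + p(u, v)
```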
Theorem 2:
If f : U × V → W is bilinear, then f is equivalent, by duality, to a linear mapping from U to Hom(V, W) = ℒ(V, W) and also to a linear mapping
from V to Hom(U, W).
Write ψ = f for the given bilinear mapping. For each fixed v ∈ V, let ψᵥ be the mapping
u ↦ ψ(u, v); that is, ψᵥ(u) = ψ(u, v). Then ψᵥ is a linear transformation from U into W by the bilinearity hypothesis. The mapping v ↦ ψᵥ is a function from V to Hom(U, W), and its linearity is due to the linearity of ψ in v when u is held fixed: for any a ∈ U, any u, v ∈ V, and any scalars c, d,
\begin{align*}
\psi_{c\mathbf{u}+d\mathbf{v}} (\mathbf{a}) &= \psi (\mathbf{a}, c\mathbf{u}+d\mathbf{v}) = c\,\psi (\mathbf{a}, \mathbf{u}) + d\,\psi (\mathbf{a}, \mathbf{v})
\\
&= c\,\psi_{u} (\mathbf{a}) + d\,\psi_v (\mathbf{a}) ,
\end{align*}
so that
\[
\psi_{c\mathbf{u}+d\mathbf{v}} = c\,\psi_{\mathbf{u}} + d\,\psi_{\mathbf{v}}
\]
as elements of Hom(U, W).
Similarly, if for each fixed η ∈ U we define ψ^η by
ψ^η(v) = ψ(η, v), then η ↦ ψ^η is a linear mapping from U to Hom(V, W). Conversely, if ϕ : U → Hom(V, W) is linear, then the function ψ defined by
ψ(u, v) = ϕ(u)(v) is bilinear. Moreover, ψ^u = ϕ(u), so that ϕ is the mapping u ↦ ψ^u.
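The correspondence in Theorem 2 can be illustrated concretely. In the following minimal Python/NumPy sketch (the spaces U = ℝ³, V = ℝ², W = ℝ² and the particular bilinear map ψ are arbitrary choices, not from the text), the curried map ϕ(u) is returned as the matrix of v ↦ ψ(u, v), and we check that ϕ(u)(v) = ψ(u, v):

```python
# Sketch of Theorem 2: "currying" a bilinear map psi : U x V -> W into a
# linear map phi : U -> Hom(V, W), with phi(u) given as a matrix.
import numpy as np

# A concrete bilinear map psi(u, v) built from two fixed matrices B1, B2,
# so that psi(u, v) = ( u.(B1 v), u.(B2 v) ) in W = R^2.
rng = np.random.default_rng(1)
B1, B2 = rng.standard_normal((3, 2)), rng.standard_normal((3, 2))
psi = lambda u, v: np.array([u @ B1 @ v, u @ B2 @ v])

def phi(u):
    """Return the matrix of the linear map v -> psi(u, v) on V = R^2."""
    return np.column_stack([psi(u, e) for e in np.eye(2)])

u = rng.standard_normal(3)
v = rng.standard_normal(2)
assert np.allclose(phi(u) @ v, psi(u, v))   # phi(u)(v) = psi(u, v)
```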
Example 5:
Composition of two linear transformations, (S, T) ↦ S ∘ T, is a bilinear mapping. A familiar special case is matrix multiplication: if the linear transformations are represented by matrices A and B of compatible sizes, their composition is represented by the product AB, which is linear in A when B is held fixed and linear in B when A is held fixed.
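A quick numerical check of this bilinearity (a minimal Python/NumPy sketch; the matrix sizes are arbitrary):

```python
# Quick check that matrix multiplication (A, B) -> A @ B, the matrix version
# of composition, is linear in each factor separately.
import numpy as np

rng = np.random.default_rng(2)
A1, A2 = rng.standard_normal((2, 3)), rng.standard_normal((2, 3))
B1, B2 = rng.standard_normal((3, 4)), rng.standard_normal((3, 4))
k = 2.5

assert np.allclose((A1 + A2) @ B1, A1 @ B1 + A2 @ B1)             # linear in the first factor
assert np.allclose(A1 @ (B1 + k * B2), A1 @ B1 + k * (A1 @ B2))   # linear in the second factor
```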
■
End of Example 5
Theorem 3:
For a real vector space V and positive integer n, let ψ : ℝⁿ × Vⁿ → V be the bilinear transformation defined via
\( \displaystyle \quad \psi \left( \mathbf{x}, \alpha \right) = \sum_{i=1}^n x_i \alpha_i , \quad \) where x = (x₁, …, xₙ) ∈ ℝⁿ and α = (α₁, …, αₙ) ∈ Vⁿ. The mapping α ↦ ψα is a linear mapping from Vⁿ to Hom(ℝⁿ, V) and, in fact, is an isomorphism.
The linearity of ψ in x is verified by direct calculation, since the formula is just a dot-product-like sum; the linearity of the mapping α ↦ ψα then follows from the previous theorem. This mapping is injective because ψα determines α: evaluating at the standard basis vectors gives ψα(eᵢ) = αᵢ for each i. It is surjective because every linear map T : ℝⁿ → V equals ψα for the tuple α with αᵢ = T(eᵢ). Hence α ↦ ψα is an isomorphism.
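For V = ℝᵐ the isomorphism of Theorem 3 is very concrete: the tuple α corresponds to the m × n matrix whose columns are the vectors αᵢ. The following minimal Python/NumPy sketch (sizes and vectors are arbitrary) illustrates both directions of the correspondence:

```python
# Sketch of Theorem 3 with V = R^m: a tuple alpha = (alpha_1, ..., alpha_n) of
# vectors in V corresponds to the linear map psi_alpha(x) = sum_i x_i * alpha_i,
# i.e. to the m-by-n matrix whose columns are the alpha_i.
import numpy as np

m, n = 4, 3
rng = np.random.default_rng(3)
alpha = [rng.standard_normal(m) for _ in range(n)]    # an element of V^n

psi_alpha = lambda x: sum(x[i] * alpha[i] for i in range(n))

# The matrix of psi_alpha has the alpha_i as its columns ...
M = np.column_stack(alpha)
x = rng.standard_normal(n)
assert np.allclose(psi_alpha(x), M @ x)

# ... and alpha is recovered from psi_alpha by evaluating at the standard basis,
# which is why alpha -> psi_alpha is a bijection onto Hom(R^n, V).
assert all(np.allclose(psi_alpha(e), a) for e, a in zip(np.eye(n), alpha))
```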
Multilinear forms
In light of Example 4, it might be tempting to think that the determinant
of a 3 × 3 matrix can be represented via a single fixed 3 × 3 matrix, but this
is not the case—the determinant of a 3 × 3 matrix is not a bilinear form, but
rather it is linear in the three columns of the input matrix. More generally, the
determinant of a p × p matrix is multilinear—linear in each of its p columns.
This generalization of bilinearity is captured by the following definition, which
requires that the function being considered is a linear form when all except for
one of its inputs are held constant.
Let V₁, V₂, … , Vₚ be vector spaces over the same field of scalars 𝔽. A function f : V₁ × V₂ × ⋯ × Vₚ → 𝔽 is called a multilinear form if, for each 1 ≤ j ≤ p and all fixed vectors v₁ ∈ V₁, v₂ ∈ V₂, … , vₚ ∈ Vₚ, the function g : Vⱼ → 𝔽 defined by
\[
g(\mathbf{v}) = f \left( \mathbf{v}_1 , \ldots , \mathbf{v}_{j-1} , \mathbf{v} , \mathbf{v}_{j+1} , \ldots , \mathbf{v}_p \right) \qquad \mbox{for all} \quad \mathbf{v} \in V_j
\]
is a linear form.
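For instance, the 3 × 3 determinant satisfies this definition with respect to its columns; the following minimal Python/NumPy sketch (sample vectors are arbitrary) checks linearity in the second column:

```python
# Numerical check that the 3x3 determinant is multilinear: it is linear in the
# second column when the other two columns are held fixed.
import numpy as np

rng = np.random.default_rng(4)
c1, c2a, c2b, c3 = (rng.standard_normal(3) for _ in range(4))
k = 3.0

det = lambda a, b, c: np.linalg.det(np.column_stack([a, b, c]))

assert np.isclose(det(c1, c2a + k * c2b, c3),
                  det(c1, c2a, c3) + k * det(c1, c2b, c3))
```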