Hilbert spaces are a particularly nice class of Banach spaces. They axiomatize ideas from Euclidean geometry such as orthogonality, projection, and the Pythagorean theorem, but the ideas apply to many infinite-dimensional spaces of functions of interest to various branches of mathematics. Hilbert spaces are also fundamental to quantum mechanics, as vectors in Hilbert spaces (up to phase) describe (pure) states of quantum systems.
Today we’ll develop and discuss some of the basic theory of Hilbert spaces. As with the theory of Banach spaces, there are (at least) two types of morphisms we might want to talk about (unitary operators and bounded operators), and we will discuss an elegant formalism that allows us to talk about both. Things written by John Baez will be cited excessively.
Definition and introductory remarks
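Let $V$ be a vector space over $\mathbb{R}$ or $\mathbb{C}$. An inner product on $V$ is a map $\langle \cdot, \cdot \rangle : V \times V \to \mathbb{R}$ (resp. $\mathbb{C}$) satisfying
- $\langle u, av + bw \rangle = a \langle u, v \rangle + b \langle u, w \rangle$ (linearity in the second argument),
- $\langle v, w \rangle = \overline{\langle w, v \rangle}$ (conjugate symmetry; this implies conjugate linearity in the first argument),
- $\langle v, v \rangle \ge 0$, and $\langle v, v \rangle = 0$ only if $v = 0$ (positive-definiteness).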
(Linearity in the second variable is conventional in physics but in mathematics the convention is generally to have linearity in the first variable. We use the physics convention above for reasons explained in the next section.)
A vector space equipped with an inner product is an inner product space. Inner products generalize the ordinary dot product of vectors in $\mathbb{R}^n$, but the formalism applies to infinite-dimensional spaces such as various function spaces, allowing us to use geometric intuition from the former to understand the latter. In quantum mechanics, inner products are fundamental as they give rise to transition amplitudes (see for example the Born rule).
Any inner product space gives rise to a function $\| v \| = \sqrt{\langle v, v \rangle}$ which is readily seen to satisfy all of the axioms of a norm with the possible exception of the triangle inequality, which we now prove.
Cauchy-Schwarz inequality: let $v, w$ be vectors in an inner product space. Then
$$| \langle v, w \rangle | \le \| v \| \, \| w \|.$$
The Cauchy-Schwarz inequality can be proven in many ways (see for example Steele’s The Cauchy-Schwarz Master Class). Although it is stated here for an arbitrary inner product space, by restricting to the subspace generated by $v$ and $w$ we see that it is really a statement about $2$-dimensional inner product spaces.
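Proof. Consider the quadratic polynomial
$$\| v + tw \|^2 = \| v \|^2 + 2t \operatorname{Re} \langle v, w \rangle + t^2 \| w \|^2$$
in the real variable $t$. By positive-definiteness, it cannot be negative, so its discriminant cannot be positive. This gives
$$4 (\operatorname{Re} \langle v, w \rangle)^2 - 4 \| v \|^2 \| w \|^2 \le 0$$
and it follows that $| \operatorname{Re} \langle v, w \rangle | \le \| v \| \, \| w \|$. Multiplying $v$ by a complex number of absolute value $1$ does not change the RHS, and it can make $\langle v, w \rangle$ real and non-negative, so that the LHS becomes $| \langle v, w \rangle |$, giving the desired inequality.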
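Corollary: $\| v + w \| \le \| v \| + \| w \|$.
Proof. By Cauchy-Schwarz,
$$\| v + w \|^2 = \| v \|^2 + 2 \operatorname{Re} \langle v, w \rangle + \| w \|^2 \le \| v \|^2 + 2 \| v \| \, \| w \| + \| w \|^2 = (\| v \| + \| w \|)^2.$$
Following the above, for an inner product we call $\| v \| = \sqrt{\langle v, v \rangle}$ the induced norm.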
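Corollary: For any inner product space $V$ and any $v \in V$, the map $w \mapsto \langle v, w \rangle$ is a continuous linear functional of operator norm $\| v \|$ with respect to the induced norm.
The identity $\| v + w \|^2 = \| v \|^2 + 2 \operatorname{Re} \langle v, w \rangle + \| w \|^2$ should be thought of as an abstract form of the law of cosines. In particular, if $\langle v, w \rangle = 0$ ($v, w$ are orthogonal), then the Pythagorean theorem $\| v + w \|^2 = \| v \|^2 + \| w \|^2$ holds.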
An inner product space is a Hilbert space if it is complete with respect to the induced norm.
Example. For any measure space $X$ with measure $\mu$, the space $L^2(X, \mu)$ is a Hilbert space with inner product
$$\langle f, g \rangle = \int_X \overline{f} g \, d\mu.$$
Special cases include the spaces $\ell^2(S)$ for a set $S$ as in the Banach space examples; when $S$ is finite and we work over the reals we recover Euclidean space with the usual inner product. In quantum mechanics, a fundamental example is $X = \mathbb{R}^3$ with Lebesgue measure, as $L^2(\mathbb{R}^3)$ is the space in which wave functions describing a particle in three spatial dimensions live. If $\mu$ is a probability measure we can think of real-valued $f, g \in L^2(X, \mu)$ as random variables, and if they happen to have expected value $0$ then $\langle f, g \rangle$ is their covariance.
If $V$ is a real inner product space with induced norm $\| \cdot \|$, then a straightforward computation shows that
$$\langle v, w \rangle = \frac{\| v + w \|^2 - \| v - w \|^2}{4}$$
and if $V$ is a complex inner product space a somewhat more tedious computation shows that
$$\langle v, w \rangle = \frac{\| v + w \|^2 - \| v - w \|^2 - i \| v + iw \|^2 + i \| v - iw \|^2}{4}.$$
In any case, we conclude that the inner product is uniquely determined by the norm it induces. Thus being Hilbert is a property of a Banach space up to isometric isomorphism. We can even characterize the Banach spaces with this property in a fairly straightforward manner: they are precisely the ones with norms satisfying the parallelogram identity
$$\| v + w \|^2 + \| v - w \|^2 = 2 \| v \|^2 + 2 \| w \|^2.$$
This is fairly annoying to prove, but it has a nice interpretation: if a norm is like the Euclidean norm in this particular respect, then it must be like the Euclidean norm in various other respects (coming from what can be proven using the inner product space axioms).
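As a sanity check on the conventions, here is a minimal sketch in plain Python that tests the complex polarization identity and the parallelogram identity numerically on $\mathbb{C}^4$, and shows the parallelogram identity failing for the sup norm on $\mathbb{R}^2$. The helper names (`inner`, `combine`, `polarization`, `parallelogram_defect`) are made up for this illustration; the inner product is taken to be linear in the second argument, as above.

```python
import random

def inner(v, w):
    """Standard inner product on C^n: conjugate-linear in v, linear in w."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def norm_sq(v):
    """Squared induced norm <v, v> (always real and non-negative)."""
    return inner(v, v).real

def combine(v, w, c):
    """Return the vector v + c*w."""
    return [a + c * b for a, b in zip(v, w)]

def polarization(v, w):
    """Recover <v, w> from the induced norm alone (complex case)."""
    return (norm_sq(combine(v, w, 1)) - norm_sq(combine(v, w, -1))
            - 1j * norm_sq(combine(v, w, 1j))
            + 1j * norm_sq(combine(v, w, -1j))) / 4

def parallelogram_defect(norm, v, w):
    """|v+w|^2 + |v-w|^2 - 2|v|^2 - 2|w|^2 for an arbitrary norm function."""
    return (norm(combine(v, w, 1)) ** 2 + norm(combine(v, w, -1)) ** 2
            - 2 * norm(v) ** 2 - 2 * norm(w) ** 2)

def l2_norm(v):
    return norm_sq(v) ** 0.5

def sup_norm(v):
    return max(abs(a) for a in v)

random.seed(0)
v = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]
w = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(4)]

print(abs(polarization(v, w) - inner(v, w)))           # rounding error only
print(abs(parallelogram_defect(l2_norm, v, w)))        # rounding error only
print(parallelogram_defect(sup_norm, [1, 0], [0, 1]))  # -2: the identity fails
```

The last line is the smallest instance of the claim that most norms do not come from inner products: for the sup norm the two sides of the parallelogram identity are $2$ and $4$.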
We might now be tempted to think of Hilbert spaces as a subcategory of $\text{Ban}_1$ (Banach spaces and weak contractions), but we shouldn’t. For example, the product or coproduct of Hilbert spaces in $\text{Ban}_1$ is almost never a Hilbert space; Hilbert spaces instead admit a direct sum coming from a generalized $\ell^2$-norm rather than a generalized $\ell^1$- or $\ell^{\infty}$-norm. This suggests that weak contractions aren’t a natural choice of morphisms between Hilbert spaces.
If we want to be permissive, we should take bounded linear operators as morphisms. If we want to be restrictive, we want all of the relevant structure to be preserved (namely the inner product), so we could take as morphisms maps $T : H_1 \to H_2$ such that
$$\langle Tv, Tw \rangle = \langle v, w \rangle.$$
These include the unitary maps, which are the invertible maps with this property.
(Note that since the inner product is uniquely determined by a composition of linear functions and the norm, it follows that a linear operator between Hilbert spaces preserves the inner product if and only if it preserves the norm. Thus we may call a map satisfying the above property an isometry.)
We also make the following observation whose name will be explained below.
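The Yoneda lemma for inner product spaces: Let $v, w$ be vectors in an inner product space such that $\langle u, v \rangle = \langle u, w \rangle$ for all $u$. Then $v = w$.
Proof. The above implies $\langle u, v - w \rangle = 0$ for all $u$, so $\langle v - w, v - w \rangle = 0$, so $v - w = 0$ by positive-definiteness.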
2-Hilbert spaces
The theory of real Hilbert spaces is a straightforward axiomatization of the properties of the dot product in Euclidean space, but the theory of complex Hilbert spaces includes an additional wrinkle, namely the issue of conjugate symmetry and the fact that the inner product is conjugate-linear rather than linear in one variable. Above I chose to have inner products be linear in the second variable rather than the first, and the reason is the following example.
Let $G$ be a finite group and consider the category $\text{Rep}(G)$ of finite-dimensional complex representations of $G$. For $V, W \in \text{Rep}(G)$ with characters $\chi_V, \chi_W$, recall that we have
$$\dim \text{Hom}_G(V, W) = \frac{1}{|G|} \sum_{g \in G} \overline{\chi_V(g)} \chi_W(g) = \langle \chi_V, \chi_W \rangle.$$
In other words, the dimension of spaces of intertwining operators defines an inner product on the complex vector space spanned by characters (formally, the tensor product $K(\text{Rep}(G)) \otimes_{\mathbb{Z}} \mathbb{C}$ where $K$ denotes the Grothendieck group) which is naturally conjugate-linear in the first variable. Morally this is because Hom is contravariant in the first variable and covariant in the second.
This example is particularly interesting because in quantum mechanics the inner product of states describes the transition amplitude between them (in a sense that I don’t completely understand), and it would not be too far-fetched to think of transition amplitudes as being morphisms in some vague sense between states.
In this way we see that $\text{Rep}(G)$ itself is a kind of categorified Hilbert space, with morphisms as a kind of categorified inner product. Decategorifying the Yoneda lemma for elements of $\text{Rep}(G)$ gives back the Yoneda lemma for inner products above. Decategorifying the isomorphism $\text{Hom}(V, W)^{\ast} \cong \text{Hom}(W, V)$ gives conjugate-symmetry. Decategorifying the adjunction between, say, restriction and induction functors gives adjoint operators (see below). And so forth. For a further elaboration on this theme, see Baez’s Higher-Dimensional Algebra II: 2-Hilbert spaces.
Projections and complements
In $\mathbb{R}^n$, the ordinary dot product allows us to define the projection
$$\text{proj}_w(v) = \frac{\langle w, v \rangle}{\langle w, w \rangle} w$$
of a vector $v$ onto another vector $w$. The above notation is somewhat confusing, as it takes two vectors as inputs when it should really take as input a vector $v$ and a subspace $W$; the projection $\text{proj}_W(v)$ should then be the closest vector in $W$ to $v$. The above is just the special case that $W = \text{span}(w)$.
We formalize this as follows. For $v \in V$ and $S \subseteq V$, define the distance
$$d(v, S) = \inf_{s \in S} \| v - s \|.$$
(Of course this definition makes sense in any metric space.) Then $s \in S$ is a closest vector in $S$ to $v$ if $\| v - s \| = d(v, S)$. We say that $S$ admits closest vectors if such a vector always exists for all $v \in V$. (Note that such a subset is in particular closed.)
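For general subsets $S$, closest vectors are not guaranteed to be unique. However:
Proposition: Let $S$ be a subset of an inner product space $V$ which is closed under taking midpoints. Then the closest vector $s \in S$ to a vector $v \in V$ is unique if it exists.
Proof. Suppose that $s_1, s_2$ are two closest vectors. By the parallelogram identity,
$$\left\| v - \frac{s_1 + s_2}{2} \right\|^2 = \frac{\| v - s_1 \|^2 + \| v - s_2 \|^2}{2} - \frac{\| s_1 - s_2 \|^2}{4}.$$
It follows that $\frac{s_1 + s_2}{2}$ (which lies in $S$ by assumption) is strictly closer to $v$ than either $s_1$ or $s_2$ unless $\| s_1 - s_2 \| = 0$, hence unless $s_1 = s_2$.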
Note that this is badly false in a general normed space. For example, in $\mathbb{R}^2$ with the $\ell^{\infty}$ norm, every vector $(x, 0)$ with $|x| \le 1$ is closest among the vectors on the $x$-axis to the vector $(0, 1)$.
In Euclidean space, projection is valuable among other things because it resolves a vector into two perpendicular components. The same is true in arbitrary inner product spaces.
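Proposition: Let $S$ be a subset of an inner product space $V$ which is closed under scalar multiplication. If the closest vector $s \in S$ to a vector $v \in V$ exists, then $\langle v - s, s \rangle = 0$.
Proof. By multiplying by a suitable unit complex number as necessary we may assume WLOG that $\langle v, s \rangle$ is real. Since $s$ is closest, the real function
$$f(t) = \| v - ts \|^2 = \| v \|^2 - 2t \langle v, s \rangle + t^2 \| s \|^2$$
has a local minimum at $t = 1$. Its derivative there is therefore
$$f'(1) = -2 \langle v, s \rangle + 2 \| s \|^2 = -2 \langle v - s, s \rangle = 0.$$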
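Let $W$ be a subspace (necessarily closed) which admits closest vectors, and let $s \in W$ be the closest vector to $v \in V$. Then it follows by the same argument (applied to the functions $t \mapsto \| v - s - tw \|^2$ for $w \in W$, which are minimized at $t = 0$) that we may write any $v \in V$ as a sum $v = s + (v - s)$ of a vector in $W$ and a vector in its orthogonal complement
$$W^{\perp} = \{ u \in V : \langle u, w \rangle = 0 \text{ for all } w \in W \}.$$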
We now need to introduce some important terminology. If $V, W$ are inner product spaces, their direct sum $V \oplus W$ can be given the inner product
$$\langle (v_1, w_1), (v_2, w_2) \rangle = \langle v_1, v_2 \rangle_V + \langle w_1, w_2 \rangle_W.$$
This defines the direct sum of inner product spaces. If an inner product space $V$ has subspaces $W_1, W_2$ such that $V$ is the internal direct sum of $W_1, W_2$ as vector spaces and moreover such that $W_1, W_2$ are orthogonal, then $V$ is an internal direct sum of $W_1, W_2$ as inner product spaces, which we write as $V = W_1 \oplus W_2$.
Proposition: Let $W$ be a subspace of an inner product space $V$ which admits closest vectors. Then $V = W \oplus W^{\perp}$.
Proof. By assumption, every $v \in V$ can be written as $v = w + w'$ where $w \in W, w' \in W^{\perp}$. Since $W \cap W^{\perp} = \{ 0 \}$, this sum decomposition is necessarily unique, which already implies that it must be linear. Since $W$ is orthogonal to $W^{\perp}$ by definition, $V$ has the direct sum inner product.
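We can reformulate the above geometric discussion algebraically in terms of axioms that the map $v \mapsto \text{proj}_W(v)$ satisfies as follows. A projection on an inner product space $V$ is a bounded linear operator $P : V \to V$ such that
- $P$ is idempotent ($P^2 = P$), and
- $P$ is self-adjoint ($\langle Pv, w \rangle = \langle v, Pw \rangle$ for all $v, w \in V$).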
We recall the following general result on idempotents.
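Proposition: Let $M$ be a left $R$-module ($R$ a ring, not necessarily commutative) and $P : M \to M$ be an idempotent morphism of $R$-modules. Then $M$ admits a direct sum decomposition $M = \ker(P) \oplus \text{im}(P)$.
Proof. We may write any $m \in M$ as $m = (m - Pm) + Pm$. Since $P(m - Pm) = Pm - P^2 m = 0$, we have $m - Pm \in \ker(P)$. Conversely, if $m \in \ker(P)$ then $m = m - Pm$, so $\ker(P) = \text{im}(1 - P)$. Since $P^2 = P$, $P$ fixes any element of $\text{im}(P)$, so $\ker(P) \cap \text{im}(P) = \{ 0 \}$. Finally, since $P$ is a morphism, its kernel and image are both submodules of $M$.
The converse is straightforward; hence studying idempotents in $\text{End}_R(M)$ is equivalent to studying direct sum decompositions of $M$.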
Applied to projections, we have the following.
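Proposition: Let $P$ be a projection on an inner product space $V$. Then $V$ admits a direct sum decomposition $V = \ker(P) \oplus \text{im}(P)$ as inner product spaces. In particular, $\ker(P) = \text{im}(P)^{\perp}$, so a projection is uniquely determined by its image.
Proof. Everything follows from the last proposition except the orthogonality of the two summands, which follows from self-adjointness: for all $v \in V$ and $w \in \ker(P)$,
$$\langle Pv, w \rangle = \langle v, Pw \rangle = 0.$$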
The converse is again straightforward. Altogether we can summarize our discussion as follows.
Theorem: The following conditions on a subspace $W$ of an inner product space $V$ are equivalent.
- $W$ admits closest vectors.
- $V = W \oplus W^{\perp}$.
- There exists a projection $P : V \to V$ such that $\text{im}(P) = W$.
We turn now to the question of which subspaces have this property.
Proposition: Let $W$ be a finite-dimensional subspace of an inner product space $V$. Then $W$ admits closest vectors.
Proof. Let $v \in V$. We want to show that there is a closest vector in $W$ to $v$. Since $0 \in W$ is at a distance $\| v \|$ from $v$, it follows by the triangle inequality that the closest vector, if it exists, is necessarily contained in the closed ball of radius $2 \| v \|$ centered at the origin in $W$. Since $W$ is finite-dimensional, this ball is compact, so the continuous function $w \mapsto \| v - w \|$ attains its minimum.
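The compactness argument is not constructive, but in a concrete case the closest vector can be found by solving the orthogonality conditions $\langle w_i, v - s \rangle = 0$ from the theorem above. Here is a minimal sketch for a two-dimensional subspace $\text{span}(w_1, w_2)$ of $\mathbb{R}^3$; the function name `closest_in_span` and the test vectors are made up for this illustration, and the $2 \times 2$ linear system is solved by Cramer’s rule.

```python
def dot(v, w):
    """Ordinary dot product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def closest_in_span(v, w1, w2):
    """Closest vector to v in span(w1, w2), for linearly independent w1, w2.

    Solves the normal equations <w_i, v - c1*w1 - c2*w2> = 0 (i = 1, 2)
    for the coefficients c1, c2 by Cramer's rule.
    """
    g11, g12, g22 = dot(w1, w1), dot(w1, w2), dot(w2, w2)
    b1, b2 = dot(w1, v), dot(w2, v)
    det = g11 * g22 - g12 * g12  # nonzero exactly when w1, w2 are independent
    c1 = (b1 * g22 - g12 * b2) / det
    c2 = (g11 * b2 - g12 * b1) / det
    return [c1 * x + c2 * y for x, y in zip(w1, w2)]

v, w1, w2 = [1.0, 2.0, 3.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]
s = closest_in_span(v, w1, w2)
residual = [a - b for a, b in zip(v, s)]

print(s)                                     # [1.0, 2.0, 0.0]
print(dot(residual, w1), dot(residual, w2))  # 0.0 0.0
```

The residual $v - s$ is orthogonal to both spanning vectors, which is exactly the decomposition of $v$ into a component in $W$ and a component in $W^{\perp}$.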
This proof does not generalize to the infinite-dimensional case, since closed unit balls are no longer compact in this setting. By assuming that $V$ is a Hilbert space, we can substitute completeness for compactness.
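Theorem: Let $C$ be a nonempty closed convex subset of a Hilbert space $H$. Then $C$ admits closest vectors.
Proof. Let $v \in H$ be a vector and let $c_n \in C$ be a sequence such that
$$\lim_{n \to \infty} \| v - c_n \| = d(v, C).$$
By the parallelogram identity,
$$\left\| v - \frac{c_n + c_m}{2} \right\|^2 = \frac{\| v - c_n \|^2 + \| v - c_m \|^2}{2} - \frac{\| c_n - c_m \|^2}{4}$$
for any $n, m$ (note that this is the same use of the parallelogram identity as when we proved that closest vectors are unique). The first term on the RHS approaches $d(v, C)^2$ as $n, m \to \infty$ while the LHS is at least $d(v, C)^2$ by definition (the midpoint lies in $C$ by convexity), so it follows that $\| c_n - c_m \| \to 0$, hence that $c_n$ is a Cauchy sequence. Since $H$ is a Hilbert space and $C$ is closed, $c_n$ has a limit $c \in C$ satisfying $\| v - c \| = d(v, C)$, hence this limit must be the closest vector.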
Corollary: A subspace of a Hilbert space admits closest vectors if and only if it is closed.
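Corollary: If $W$ is a subspace of a Hilbert space $H$, then $(W^{\perp})^{\perp} = \overline{W}$.
Proof. $\overline{W}$ is a closed subspace of $H$ such that $\overline{W}^{\perp} = W^{\perp}$, hence by the above we have a direct sum decomposition
$$H = \overline{W} \oplus W^{\perp}.$$
In any direct sum decomposition the two spaces are orthogonal complements of each other, so it follows that $(W^{\perp})^{\perp} = \overline{W}$ as desired.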
Orthonormal bases
The theory of Banach spaces is unlike ordinary linear algebra in that Banach spaces do not admit a particularly good notion of basis. The linear-algebraic notion of basis, which only allows finite sums, is clearly unsuitable: it ignores the infinite sums which are now available, and spaces of functions don’t have reasonable Hamel bases anyway (see for example this math.SE question). The next obvious choice is to talk about Schauder bases, which are sequences $e_1, e_2, \dots$ in a Banach space $V$ such that every $v \in V$ has a unique representation as an infinite sum
$$v = \sum_{n=1}^{\infty} c_n e_n.$$
Unlike ordinary bases, Schauder bases must be ordered since the sum above is not required to converge absolutely. They also don’t always exist, even for separable Banach spaces; there is a counterexample due to Enflo. Finally, although uniqueness forces the function sending a vector $v$ to the coefficient $c_n$ above to be linear, it is far from obvious that this function is continuous (it is, but the proof needs the open mapping theorem).
But everything works out for Hilbert spaces. In any inner product space, a collection of vectors $e_i$ is orthonormal if they satisfy $\langle e_i, e_j \rangle = \delta_{ij}$. In particular, the $e_i$ have norm $1$ and are linearly independent, since if $v = \sum_i c_i e_i$ (a finite sum) then $c_i = \langle e_i, v \rangle$ for all $i$. An orthonormal basis of a Hilbert space $H$ is an orthonormal set $\{ e_i \}$ whose span is dense in $H$.
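Bessel’s inequality: Let $\{ e_i \}$ be an orthonormal set and $v$ a vector in an inner product space $V$. Then $\langle e_i, v \rangle = 0$ for all but countably many $i$, and
$$\sum_i | \langle e_i, v \rangle |^2 \le \| v \|^2.$$
Proof. Let $\{ e_i \}$ be indexed by a set $I$ and let $J$ be a finite subset of $I$. Let $P_J$ be the projection onto $\text{span}(e_j : j \in J)$. Then we may write $P_J$ explicitly as
$$P_J v = \sum_{j \in J} \langle e_j, v \rangle e_j$$
by inspection. Since $\| P_J v \| \le \| v \|$, taking norms gives
$$\sum_{j \in J} | \langle e_j, v \rangle |^2 \le \| v \|^2$$
for all finite subsets $J$. By exhausting every countable subset of $I$ by finite sets, it follows that the inequality holds for all countable subsets of $I$. Because a sum of uncountably many positive real numbers cannot be finite, it follows that $\langle e_i, v \rangle = 0$ for all but countably many $i$, so the inequality holds for $I$.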
Bessel’s inequality becomes an equality in the following case, which is an infinite-dimensional generalization of the Pythagorean theorem.
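Parseval’s identity: Let $e_1, e_2, \dots$ be an at most countable orthonormal set. If $v = \sum_n c_n e_n$ converges, then $\sum_n | c_n |^2$ converges, $c_n = \langle e_n, v \rangle$, and
$$\| v \|^2 = \sum_n | c_n |^2.$$
Proof. Let $v = \sum_n c_n e_n$ and let $v_N = \sum_{n \le N} c_n e_n$ be its partial sums. We have $\| v_N \|^2 = \sum_{n \le N} | c_n |^2$ by orthonormality. Convergence implies convergence of norms, hence $\sum_n | c_n |^2$ converges and equals $\| v \|^2$. Finally, since $\langle e_n, v_N \rangle = c_n$ for all $N \ge n$, it follows by continuity that $\langle e_n, v \rangle = c_n$.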
We would like to conclude that orthonormal bases really are bases in a suitable Hilbert space sense, but first we need to prove the following.
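Proposition: Let $\{ e_i \}$ be an orthonormal set and suppose that $v$ lies in the closure of the span of the $e_i$. Then
$$v = \sum_i \langle e_i, v \rangle e_i.$$
Proof. By the above, we may assume WLOG that $\{ e_i \}$ is countable, indexed $e_1, e_2, \dots$. Let $P_n$ be the projection onto $\text{span}(e_1, \dots, e_n)$. Since $P_n v = \sum_{i \le n} \langle e_i, v \rangle e_i$ is the closest vector in $\text{span}(e_1, \dots, e_n)$ to $v$, it follows that $v$ lies in the closure of the span of the $e_i$ if and only if $\lim_{n \to \infty} P_n v = v$.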
Corollary: Let $H$ be a Hilbert space with an orthonormal basis $\{ e_i \}_{i \in I}$. Then the map
$$T : \ell^2(I) \to H, \quad (c_i)_{i \in I} \mapsto \sum_{i \in I} c_i e_i$$
is a unitary isomorphism.
Proof. We showed above that $T$ preserves norms, so it remains to prove that it is linear. $T$ clearly respects scalar multiplication, and it also clearly respects addition on the subspace of $\ell^2(I)$ consisting of sequences with finite support. Since $T$ preserves norms, the rest follows by the continuity of $T$ and addition.
This is a strong structure theorem for Hilbert spaces with an orthonormal basis. We now turn our attention to proving that an orthonormal basis always exists. The idea, known as the Gram-Schmidt process, is the following in finitely many dimensions.
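Suppose $v_1, \dots, v_n$ are finitely many nonzero vectors in an inner product space. We’d like to find an orthonormal set of vectors $e_1, \dots, e_m$ with the same span. We’ll do this inductively. First, set $e_1 = \frac{v_1}{\| v_1 \|}$. Assuming that $e_1, \dots, e_k$ have been defined, let $P_k$ denote the projection onto $\text{span}(e_1, \dots, e_k)$. Now, if $j$ is the smallest index such that $v_j \notin \text{span}(e_1, \dots, e_k)$, we can set
$$e_{k+1} = \frac{v_j - P_k v_j}{\| v_j - P_k v_j \|}.$$
It follows that $e_1, \dots, e_{k+1}$ is an orthonormal basis of $\text{span}(v_1, \dots, v_j)$ for all $k$ (where $j$ is the index chosen at step $k$).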
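The finite procedure is short enough to write out. The following is a minimal sketch in plain Python for vectors in $\mathbb{C}^n$ with the standard inner product; `gram_schmidt` and the tolerance parameter `tol` are names introduced here for illustration. It subtracts the components along the earlier $e_i$ one at a time (the “modified” variant, which is algebraically the same as subtracting $P_k v_j$ all at once) and skips any vector that already lies, up to rounding, in the span of the earlier ones.

```python
import math

def inner(v, w):
    """Standard inner product on C^n: conjugate-linear in v, linear in w."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list with the same span as `vectors`."""
    basis = []
    for v in vectors:
        # r ends up as v - P_k v, where P_k projects onto span(basis).
        r = list(v)
        for e in basis:
            c = inner(e, r)
            r = [x - c * y for x, y in zip(r, e)]
        length = math.sqrt(inner(r, r).real)
        if length > tol:  # otherwise v is (numerically) in the span already
            basis.append([x / length for x in r])
    return basis

vs = [[1, 1, 0], [2, 2, 0], [1, 0, 1j]]
es = gram_schmidt(vs)
print(len(es))  # 2, since the second vector is a multiple of the first
```

The hard-coded absolute tolerance is a simplification; for badly scaled inputs one would want a threshold relative to the size of the vectors.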
The Gram-Schmidt process as defined here extends without fuss to countably many vectors $v_1, v_2, \dots$.
Corollary: Every separable Hilbert space has an orthonormal basis.
In particular, the separable infinite-dimensional Hilbert space is unique up to unitary isomorphism. Thus physicists sometimes speak of “Hilbert space” (as in “vectors in Hilbert space”) by which they mean the unique separable infinite-dimensional Hilbert space.
To extend the Gram-Schmidt process to an arbitrary number of vectors in a Hilbert space, we use transfinite induction. If you don’t care about non-separable Hilbert spaces, you can stop reading here.
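Let $v_{\alpha}$ be a collection of vectors in a Hilbert space indexed by ordinals and define a corresponding orthonormal set $e_{\beta}$ as follows. As above, we set $e_0 = \frac{v_0}{\| v_0 \|}$. If $e_{\beta}$ has already been defined for all $\beta \le \gamma$, let $\alpha$ be the least ordinal such that $v_{\alpha}$ is not contained in the closure of $\text{span}(e_{\beta} : \beta \le \gamma)$, let $P_{\gamma}$ be the projection onto this subspace, and define
$$e_{\gamma + 1} = \frac{v_{\alpha} - P_{\gamma} v_{\alpha}}{\| v_{\alpha} - P_{\gamma} v_{\alpha} \|}.$$
Similarly, if $e_{\beta}$ has already been defined for all $\beta < \gamma$ for $\gamma$ a limit ordinal, let $\alpha$ be the least ordinal such that $v_{\alpha}$ is not contained in the closure of $\text{span}(e_{\beta} : \beta < \gamma)$, let $P_{< \gamma}$ be the projection onto this subspace, and define
$$e_{\gamma} = \frac{v_{\alpha} - P_{< \gamma} v_{\alpha}}{\| v_{\alpha} - P_{< \gamma} v_{\alpha} \|}.$$
By transfinite induction the $e_{\beta}$ are orthonormal and $\{ e_{\beta} : \beta \le \gamma \}$ is an orthonormal basis for the closure of $\text{span}(v_{\alpha'} : \alpha' \le \alpha)$ (where $\alpha$ is the index of the vector used to define $e_{\gamma}$).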
Corollary: Every Hilbert space has an orthonormal basis.
Unfortunately, this seems to require some form of choice. What we can prove in ZF is that every Hilbert space for which one can exhibit explicitly a dense well-ordered subset has an orthonormal basis.
Using orthonormal bases
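Consider the Hilbert space $L^2(S^1)$ where $S^1 = \mathbb{R} / 2 \pi \mathbb{Z}$ carries normalized Haar measure. Equivalently, consider $L^2([-\pi, \pi])$ with the inner product
$$\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} \overline{f(x)} g(x) \, dx.$$
The function $x$ separates points of $[-\pi, \pi]$, so by Stone-Weierstrass the smallest algebra containing it which is closed under complex conjugation is dense in $C([-\pi, \pi])$ in the uniform topology, hence in the $L^2$ norm. Since $C([-\pi, \pi])$ is dense in $L^2([-\pi, \pi])$, it follows that in fact the algebra of complex polynomials is dense in $L^2([-\pi, \pi])$. Consequently, $L^2([-\pi, \pi])$ is separable and has an orthonormal basis. The Gram-Schmidt process can be used to construct such a basis starting from the vectors $1, x, x^2, \dots$; these are, up to some normalization and a rescaling of the interval, the Legendre polynomials.
Another orthonormal basis comes from the observation that $e^{ix}$ also separates points of $S^1$, so the span of the functions $e^{inx}, n \in \mathbb{Z}$, is also dense by Stone-Weierstrass. Happily, these functions are already orthonormal: we have
$$\langle e^{inx}, e^{imx} \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(m-n)x} \, dx = \delta_{nm}.$$
It follows that we may expand any function $f$ in $L^2(S^1)$ in a Fourier series
$$f(x) = \sum_{n \in \mathbb{Z}} \langle e^{inx}, f \rangle e^{inx}.$$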
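We caution that what we have proven so far is only enough to conclude that Fourier series converge in $L^2$, which says nothing about uniform or pointwise convergence; these are much more subtle matters. However, even just $L^2$ convergence is enough to prove some nontrivial results. For example, we compute using integration by parts that
$$\langle e^{inx}, x \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} x e^{-inx} \, dx = \frac{(-1)^n i}{n}$$
if $n \neq 0$ and $\langle 1, x \rangle = 0$ since $x$ is odd, hence
$$x = \sum_{n \neq 0} \frac{(-1)^n i}{n} e^{inx}$$
in $L^2([-\pi, \pi])$. Taking norms of both sides, we conclude
$$\frac{\pi^2}{3} = \frac{1}{2\pi} \int_{-\pi}^{\pi} x^2 \, dx = \sum_{n \neq 0} \frac{1}{n^2}, \quad \text{hence} \quad \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}.$$
This is the answer to the famous Basel problem. Replacing $x$ with $x^k$ above gives us a method for evaluating $\zeta(2k)$ for all positive integers $k$.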
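The computation is easy to check numerically. The sketch below assumes the normalization in which $f(x) = x$ lives on $[-\pi, \pi]$ with measure $\frac{dx}{2\pi}$ and the basis functions are $e^{inx}$; under that assumption the coefficients should come out to $\frac{(-1)^n i}{n}$. The helper names `coeff` and `basel_partial` are made up for this illustration, and the integral is approximated by a plain midpoint rule.

```python
import cmath
import math

def coeff(n, m=20000):
    """Midpoint-rule approximation of <e_n, f> for f(x) = x, i.e. of
    (1 / 2pi) * integral over [-pi, pi] of exp(-i n x) * x dx."""
    h = 2 * math.pi / m
    total = 0j
    for k in range(m):
        x = -math.pi + (k + 0.5) * h
        total += cmath.exp(-1j * n * x) * x
    return total * h / (2 * math.pi)

def basel_partial(N):
    """Partial sum 1/1^2 + 1/2^2 + ... + 1/N^2."""
    return sum(1.0 / n ** 2 for n in range(1, N + 1))

# Coefficients: expect i * (-1)^n / n.
print(coeff(1), coeff(2))

# Parseval: ||x||^2 = pi^2 / 3 should equal the sum of 1/n^2 over all n != 0.
print(math.pi ** 2 / 3, 2 * basel_partial(100000))
```

The two numbers on the last line agree only to about four decimal places, because the tail of $\sum 1/n^2$ past $N$ is roughly $1/N$; the series converges slowly even though the identity is exact.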
Adjoints
The assignment $v \mapsto \langle v, - \rangle$ defines an injection from any inner product space $V$ to its dual space $V^{\ast}$ (recall that this consists of bounded linear operators $V \to \mathbb{C}$). Moreover, $\langle v, - \rangle$ has norm $\| v \|$, so this injection is norm-preserving. However, it is conjugate-linear rather than linear. To fix this, we introduce for any complex vector space $V$ the conjugate $\overline{V}$ (not to be confused with its closure in some ambient space!), which is the same abelian group as $V$ but with scalar multiplication defined by the conjugate of scalar multiplication in $V$. (This only matters if we work over $\mathbb{C}$ rather than over $\mathbb{R}$.) Then the inner product on any inner product space defines a linear norm-preserving injection
$$\overline{V} \to V^{\ast}, \quad v \mapsto \langle v, - \rangle.$$
It is natural to ask when this map is an isomorphism (of normed vector spaces).
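Riesz representation: Let $H$ be a Hilbert space. Then the map $\overline{H} \to H^{\ast}$ above is an isomorphism.
Proof. We know that it is linear, injective, and norm-preserving, so it suffices to prove that it is surjective. Let $f : H \to \mathbb{C}$ be a continuous linear functional. The claim is trivial if $f$ is zero, so suppose $f$ is nonzero. $\ker(f)$ is closed, so $H$ admits a direct sum decomposition
$$H = \ker(f) \oplus \ker(f)^{\perp}.$$
Since $f$ is nonzero, $\ker(f)^{\perp}$ is nontrivial, and if $v, w \in \ker(f)^{\perp}$ then $f(v) w - f(w) v \in \ker(f) \cap \ker(f)^{\perp} = \{ 0 \}$, so it follows that $\ker(f)^{\perp}$ is one-dimensional. If $v \in \ker(f)^{\perp}$ is any nonzero vector, then $\langle v, - \rangle$ is a continuous linear functional which is trivial on $\ker(f)$ and nontrivial on its orthogonal complement, so must be equal to $f$ up to a scalar.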
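The completeness of $H$ is essential. For example, let $V$ be the space of compactly supported (that is, finitely supported) sequences $\mathbb{N} \to \mathbb{C}$ with the inner product induced from $\ell^2(\mathbb{N})$. Then there is a continuous linear functional $V \to \mathbb{C}$ sending such a sequence $(a_n)_{n \ge 1}$ to, say, $\sum_{n \ge 1} \frac{a_n}{n}$ which is not of the form $\langle v, - \rangle$ for any $v \in V$.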
Corollary: Hilbert spaces are reflexive.
The Riesz representation theorem allows us to define the following crucial operation.
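Theorem-Definition: let $T : H_1 \to H_2$ be a bounded linear operator between Hilbert spaces. There exists a unique map $T^{\dagger} : H_2 \to H_1$, the adjoint (or Hermitian adjoint) of $T$, which satisfies
$$\langle T^{\dagger} w, v \rangle = \langle w, T v \rangle \quad \text{for all } v \in H_1, w \in H_2.$$
Proof. For fixed $w \in H_2$ the map $v \mapsto \langle w, Tv \rangle$ is a continuous linear functional on $H_1$, so by Riesz representation there exists a unique vector $T^{\dagger} w \in H_1$ such that $\langle T^{\dagger} w, v \rangle = \langle w, Tv \rangle$ for all $v \in H_1$. Moreover, by uniqueness
$$T^{\dagger}(a w_1 + b w_2) = a T^{\dagger} w_1 + b T^{\dagger} w_2$$
so the assignment is linear. Finally,
$$\| T^{\dagger} w \| = \sup_{\| v \| = 1} | \langle T^{\dagger} w, v \rangle | = \sup_{\| v \| = 1} | \langle w, Tv \rangle | \le \| T \| \, \| w \|$$
so $T^{\dagger}$ is bounded (in fact has the same norm as $T$).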
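Remark. Let $T : H \to H$ and let $\{ e_i \}$ be an orthonormal basis. Then $\langle e_i, T^{\dagger} e_j \rangle = \overline{\langle e_j, T e_i \rangle}$, which says precisely that the “matrix” of $T^{\dagger}$ with respect to the basis $\{ e_i \}$ is the conjugate transpose of the “matrix” of $T$.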
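In finite dimensions this remark can be tested directly. Here is a minimal sketch in plain Python, with matrices stored as lists of rows; the names `apply` and `adjoint` are made up for this illustration. It builds the conjugate transpose of a random $2 \times 3$ complex matrix $T : \mathbb{C}^3 \to \mathbb{C}^2$ and checks the defining property $\langle T^{\dagger} w, v \rangle = \langle w, Tv \rangle$, with the inner product linear in the second argument.

```python
import random

def inner(v, w):
    """Standard inner product on C^n: conjugate-linear in v, linear in w."""
    return sum(a.conjugate() * b for a, b in zip(v, w))

def apply(A, v):
    """Matrix-vector product, with A given as a list of rows."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def adjoint(A):
    """Conjugate transpose: entry (j, i) is the conjugate of A[i][j]."""
    rows, cols = len(A), len(A[0])
    return [[A[i][j].conjugate() for i in range(rows)] for j in range(cols)]

random.seed(1)

def rand_complex():
    return complex(random.gauss(0, 1), random.gauss(0, 1))

T = [[rand_complex() for _ in range(3)] for _ in range(2)]  # T : C^3 -> C^2
v = [rand_complex() for _ in range(3)]
w = [rand_complex() for _ in range(2)]

lhs = inner(apply(adjoint(T), w), v)  # <T^dagger w, v>, computed in C^3
rhs = inner(w, apply(T, v))           # <w, T v>, computed in C^2
print(abs(lhs - rhs))                 # rounding error only
```

Note that the two inner products live in different spaces, which is why the adjoint of a map $H_1 \to H_2$ has to go from $H_2$ to $H_1$.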
Remark. The adjoint is closely related, but not identical, to the dual $T^{\ast}$. If $V, W$ are any two Banach spaces, then for any bounded linear operator $T : V \to W$ we may define its dual $T^{\ast} : W^{\ast} \to V^{\ast}$ on dual spaces, which is defined by precomposition. It is a corollary of the Hahn-Banach theorem that $\| T^{\ast} \| = \| T \|$, but the above argument does not need the Hahn-Banach theorem. If $V, W$ are Hilbert spaces, then $T^{\ast}$ is a map $W^{\ast} \to V^{\ast}$, or equivalently by Riesz representation a map $\overline{W} \to \overline{V}$, whereas the adjoint is a map $W \to V$, so it is important not to confuse the two as mathematical objects; however, one is essentially the complex conjugate of the other.
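The adjoint satisfies the following basic properties which follow straightforwardly from the definition. The second property shows that taking adjoints may be regarded as a generalization of complex conjugation for operators on Hilbert spaces.
- $(S + T)^{\dagger} = S^{\dagger} + T^{\dagger}$,
- $(cT)^{\dagger} = \overline{c} T^{\dagger}$ ($c$ a scalar),
- $(ST)^{\dagger} = T^{\dagger} S^{\dagger}$,
- $(T^{\dagger})^{\dagger} = T$.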
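The adjoint allows us to define the following important classes of linear operators. A bounded linear operator $T : H \to H$ on a Hilbert space is
- self-adjoint if $T^{\dagger} = T$,
- skew-adjoint if $T^{\dagger} = -T$,
- unitary if $T^{\dagger} T = T T^{\dagger} = \text{id}_H$,
- normal if $T^{\dagger} T = T T^{\dagger}$.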
In quantum mechanics, self-adjoint operators play the role of real-valued observables. They should be thought of as the “real operators,” for example because their eigenvalues are necessarily real. Any operator can be written uniquely as the sum of a self-adjoint and skew-adjoint operator
$$T = \frac{T + T^{\dagger}}{2} + \frac{T - T^{\dagger}}{2}.$$
Since $A$ is self-adjoint if and only if $iA$ is skew-adjoint, one can think of the above as a decomposition of an operator into its real and imaginary parts $T = \frac{T + T^{\dagger}}{2} + i \frac{T - T^{\dagger}}{2i}$, although this is not particularly useful unless the two commute (which is the case if and only if $T$ is normal). When that happens, if $v$ is an eigenvector of $T$ with eigenvalue $\lambda$, then $v$ is an eigenvector of $T^{\dagger}$ with eigenvalue $\overline{\lambda}$ (we will prove this below), hence $v$ is an eigenvector of $\frac{T + T^{\dagger}}{2}$ with eigenvalue $\operatorname{Re}(\lambda)$ and an eigenvector of $\frac{T - T^{\dagger}}{2i}$ with eigenvalue $\operatorname{Im}(\lambda)$.
The unitary maps are precisely the invertible maps preserving the inner product. They form a group, the unitary group $U(H)$ of $H$. A homomorphism $G \to U(H)$ where $G$ is a group is a unitary representation of $G$, and these are a very natural object of study. (See for example the Peter-Weyl theorem.)
The skew-adjoint maps form a Lie algebra under commutator, the unitary Lie algebra $\mathfrak{u}(H)$. These are precisely the maps $A$ such that $t \mapsto e^{tA}$ is a continuous group homomorphism $\mathbb{R} \to U(H)$. The proof is straightforward but we will defer it to the next post when it can be done in slightly greater generality.
The spectral theorem in finite dimensions
As a simple but important illustration of thinking in terms of adjoints, we prove the following.
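Spectral theorem: Let $T$ be a self-adjoint operator on a finite-dimensional Hilbert space $H$. Then there exists an orthonormal basis of $H$ consisting of eigenvectors of $T$, and all eigenvalues of $T$ are real.
Proof. The first step is to prove that $T$ has an eigenvector. This is true for any linear transformation on a finite-dimensional complex vector space using, for example, standard facts about characteristic polynomials, but we will give an independent proof that more strongly suggests the correct generalization to the infinite-dimensional case.
Let $v$ be a vector of norm $1$ such that $\langle v, Tv \rangle$ (which is real, since $T$ is self-adjoint) is maximized. (Such a vector exists by compactness.) We claim that $v$ is an eigenvector of $T$. To see this, let $t \in \mathbb{R}$ and let $w \perp v$ be a unit vector. Then $v \cos t + w \sin t$ is a one-parameter family of unit vectors, and by assumption the function
$$f(t) = \langle v \cos t + w \sin t, T(v \cos t + w \sin t) \rangle$$
has a local maximum at $t = 0$. We compute that this is equal to
$$\cos^2 t \, \langle v, Tv \rangle + 2 \sin t \cos t \operatorname{Re} \langle w, Tv \rangle + \sin^2 t \, \langle w, Tw \rangle.$$
Its derivative at $t = 0$ is equal to
$$f'(0) = 2 \operatorname{Re} \langle w, Tv \rangle = 0.$$
Since we may scale $w$ by unit complex numbers without loss of generality, it follows that $\langle w, Tv \rangle = 0$ for all $w \perp v$, hence $Tv = \lambda v$ for some $\lambda$. Since
$$\lambda = \langle v, Tv \rangle = \langle Tv, v \rangle = \overline{\lambda}$$
it follows that $\lambda$ is real. Finally, since
$$\langle v, Tw \rangle = \langle Tv, w \rangle = \lambda \langle v, w \rangle = 0 \quad \text{for all } w \perp v$$
it follows that $v^{\perp}$ is an invariant subspace for $T$, so by induction we may complete $v$ to an orthonormal basis of eigenvectors of $T$ as desired.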
An equivalent statement is that a self-adjoint operator is diagonalizable by a unitary operator. Since commuting operators act on each other’s eigenspaces, this is also true for normal operators (although the eigenvalues need no longer be real in this case). More generally, we can say the following.
Corollary: Let $T_i$ be a commuting family of normal operators on a finite-dimensional Hilbert space $H$. Then there exists an orthonormal basis of $H$ consisting of eigenvectors for all of the $T_i$.
In other words, the $T_i$ may be simultaneously diagonalized by a unitary operator.
A geometric interpretation of the spectral theorem is the following. Working over $\mathbb{R}$ for simplicity, $T$ is a self-adjoint operator if and only if the bilinear form $B(v, w) = \langle v, Tw \rangle$ is symmetric. Associated to such a bilinear form is the quadratic form $q(v) = B(v, v) = \langle v, Tv \rangle$ from which it may be recovered. The spectral theorem shows that, letting $e_1, \dots, e_n$ be an orthonormal basis of eigenvectors and letting $\lambda_1, \dots, \lambda_n$ be the corresponding eigenvalues, we may write
$$q \left( \sum_i x_i e_i \right) = \sum_i \lambda_i x_i^2.$$
The “unit spheres” $\{ v : q(v) = 1 \}$ then describe shapes in $\mathbb{R}^n$ generalizing conic sections for $n = 2$ depending on how many of the $\lambda_i$ are positive, negative, or zero. For example, when $n = 3$ we may get ellipsoids or hyperboloids. $q$ is positive-definite if and only if all of the $\lambda_i$ are positive, in which case $\{ v : q(v) = 1 \}$ describes an ellipsoid. In this case the vectors $e_i$ can be interpreted as the “principal axes” of the ellipsoid, which generalize the semimajor and semiminor axis from the case $n = 2$, and the $\lambda_i$ are the squares of the reciprocals of the lengths of these axes.
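For $n = 2$ everything can be made explicit. The sketch below diagonalizes a real symmetric matrix $\begin{pmatrix} a & b \\ b & c \end{pmatrix}$ in closed form by rotating through the angle $\theta = \frac{1}{2} \operatorname{atan2}(2b, a - c)$, and reads off the semi-axis lengths $1 / \sqrt{\lambda_i}$ of the ellipse $q(v) = 1$. The function name `principal_axes` is made up for this illustration, and the closed form is a standard one for $2 \times 2$ matrices rather than anything specific to the proof above.

```python
import math

def principal_axes(a, b, c):
    """Eigen-decomposition of the real symmetric matrix [[a, b], [b, c]].

    Returns ((lam1, e1), (lam2, e2)) with lam1 >= lam2 and e1, e2 an
    orthonormal basis of eigenvectors.
    """
    theta = 0.5 * math.atan2(2 * b, a - c)
    e1 = (math.cos(theta), math.sin(theta))
    e2 = (-math.sin(theta), math.cos(theta))
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return (mean + radius, e1), (mean - radius, e2)

def apply(a, b, c, v):
    """Apply [[a, b], [b, c]] to the vector v."""
    return (a * v[0] + b * v[1], b * v[0] + c * v[1])

# The quadratic form q(x, y) = 2x^2 + 2xy + 2y^2.
(lam1, e1), (lam2, e2) = principal_axes(2.0, 1.0, 2.0)
print(lam1, lam2)                                # eigenvalues, approximately 3 and 1
print(1 / math.sqrt(lam1), 1 / math.sqrt(lam2))  # semi-axis lengths of q = 1
```

Here the short axis of the ellipse points along $(1, 1)$, the direction in which $q$ grows fastest.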
The dagger category of Hilbert spaces
The category $\text{Hilb}$ of Hilbert spaces has as morphisms the bounded linear operators. Since two Hilbert spaces which are bi-Lipschitz equivalent have orthonormal bases of the same cardinality, they are actually isometrically (equivalently, unitarily) isomorphic, but not every bi-Lipschitz equivalence is an isometry. We still want to talk about unitary maps in this setting, so how should we do that?
The answer is to explicitly make the adjoint part of the structure of $\text{Hilb}$. We define a dagger category, or $\dagger$-category, to be a category $C$ equipped with a contravariant functor $\dagger : C \to C$ which is the identity on objects and which satisfies $\dagger \circ \dagger = \text{id}_C$. More explicitly, for every pair of objects $a, b$ there is a map $\dagger : \text{Hom}(a, b) \to \text{Hom}(b, a)$ such that $(f \circ g)^{\dagger} = g^{\dagger} \circ f^{\dagger}$ and $(f^{\dagger})^{\dagger} = f$. In any dagger category, an endomorphism $f : a \to a$ is self-adjoint if $f^{\dagger} = f$ and an isomorphism $f : a \to b$ is unitary if $f^{\dagger} = f^{-1}$. A functor $F : C \to D$ between dagger categories is a dagger functor if $F(f^{\dagger}) = F(f)^{\dagger}$.
Example. Let $\text{Rel}$ denote the category of sets and relations. Recall that a relation $R$ between two sets $X, Y$ is a subset of their Cartesian product $X \times Y$. We write $x R y$ to mean that $(x, y)$ is in this subset. Composition of relations is defined as follows: if $R : X \to Y$ and $S : Y \to Z$ are two relations, then $RS : X \to Z$ is the relation defined by
$$x (RS) z \Leftrightarrow \exists y \in Y : x R y \text{ and } y S z.$$
(Note that this disagrees with the usual convention for function composition, where a function $f : X \to Y$ is realized as the relation $\{ (x, f(x)) : x \in X \}$; what I call $RS$ would be for functions called $S \circ R$.) For intuition, you should think of a relation between two sets as defining a partially defined and nondeterministic function between them (“nondeterministic” is another way to say “multivalued” but I think it gives a better intuition).
$\text{Rel}$ is a dagger category with the dagger $R^{\dagger} : Y \to X$ defined by
$$y R^{\dagger} x \Leftrightarrow x R y.$$
A relation is self-adjoint if and only if it is symmetric, and every isomorphism is unitary (and is also a bijective function).
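Finite relations are easy to experiment with. Here is a minimal sketch in Python that represents a relation as a set of pairs, with composition written in the same left-to-right order as above; the function names `compose` and `dagger` and the sample relations are made up for this illustration. It checks the two dagger axioms on a small example.

```python
def compose(R, S):
    """Diagrammatic composite RS: x (RS) z iff x R y and y S z for some y."""
    return {(x, z) for (x, y1) in R for (y2, z) in S if y1 == y2}

def dagger(R):
    """The converse relation: y R^dagger x iff x R y."""
    return {(y, x) for (x, y) in R}

R = {(1, "a"), (1, "b"), (2, "b")}  # a relation from {1, 2} to {"a", "b"}
S = {("a", "x"), ("b", "y")}        # a relation from {"a", "b"} to {"x", "y"}

print(sorted(compose(R, S)))        # [(1, 'x'), (1, 'y'), (2, 'y')]

# The dagger reverses the order of composition and squares to the identity.
print(dagger(compose(R, S)) == compose(dagger(S), dagger(R)))  # True
print(dagger(dagger(R)) == R)                                  # True
```

The relation `R` here is genuinely multivalued (it relates `1` to both `"a"` and `"b"`), which is the “nondeterministic function” intuition in action.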
Example. Let $n$ be a positive integer. The category $n\text{Cob}$ of $n$-cobordisms is the category whose objects are $(n-1)$-dimensional compact manifolds and whose morphisms $M \to N$ are diffeomorphism classes of $n$-dimensional manifolds with boundary the disjoint union $M \sqcup N$. Composition in this category is defined by “sewing together” two manifolds at a common boundary component. (There are some subtleties here about maintaining a manifold structure when doing this that we will ignore completely.)
$n\text{Cob}$ is a dagger category with the dagger given by switching the role of $M$ and $N$; in other words, “turning cobordisms around.”
Heuristically speaking, the morphisms in $n\text{Cob}$ describe time evolution between $(n-1)$-dimensional “spaces,” with the cobordisms describing $n$-dimensional “spacetimes.” (To make the connection to general relativity closer we should require, say, a Lorentzian structure on the cobordisms such that the boundary is a spacelike slice.)
$n\text{Cob}$ is of fundamental importance to the subject of topological quantum field theory, which is roughly speaking the study of certain kinds of functors $n\text{Cob} \to \text{Vect}$. A unitary TQFT is a certain kind of dagger functor $n\text{Cob} \to \text{Hilb}$, which can be thought of as a “functor from general relativity to quantum mechanics.” For an elaboration on this point of view, see Baez’s Physics, Topology, Logic, and Computation: a Rosetta Stone.
Example. Let $C$ be any category which admits finite pullbacks. The category $\text{Span}(C)$ of spans in $C$ is the category whose objects are those of $C$ and whose morphisms $a \to b$ are diagrams $a \leftarrow c \rightarrow b$ with composition defined by pullback. Given any span its dagger is simply obtained by switching $a$ and $b$.
Spans of sets generalize relations in that they allow “multiple arrows” between an element of $a$ and an element of $b$. They also generalize cobordisms, since one can think of a cobordism as a cospan $M \rightarrow W \leftarrow N$ where the two arrows are the two inclusions of the boundary components into the cobordism. For more about spans, see this page by Baez, which contains slides for a talk as well as references. The tale of groupidification is also relevant.
But let’s return to Hilbert spaces for the time being. Given that we can define unitary maps using only the adjoint, and unitary maps are the isomorphisms preserving the inner products, it seems that the adjoint already captures the inner product on a Hilbert space. This is in fact true.
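We first need some notation. In $\text{Hilb}$, there is a distinguished object $1$, the one-dimensional Hilbert space $\mathbb{C}$. This object represents the obvious forgetful functor to $\text{Set}$ in that $\text{Hom}(1, H)$ can be canonically identified with the vectors in $H$. Thus we may think of vectors in $H$ as morphisms $1 \to H$.
Proposition: Let $v, w : 1 \to H$ be vectors. Then $v^{\dagger} w = \langle v, w \rangle$.
Proof. By definition, $v^{\dagger}$ is the unique operator $H \to 1$ satisfying
$$\langle v^{\dagger} w, c \rangle = \langle w, vc \rangle \quad \text{for all } w \in H, c \in \mathbb{C}.$$
Since $v^{\dagger} w$ is a morphism $1 \to 1$, it is just a scalar, so taking $c = 1$ gives $\overline{v^{\dagger} w} = \langle w, v \rangle$ and the conclusion follows by conjugate symmetry.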
In any dagger category $C$ with a distinguished object $1$ (usually the identity object of a monoidal operation on $C$ making it a dagger monoidal category) we may therefore define inner products $\langle v, w \rangle = v^{\dagger} w$ of morphisms $v, w : 1 \to a$ taking values in $\text{End}(1)$, and this inner product satisfies $\langle f^{\dagger} w, v \rangle = \langle w, f v \rangle$, so the dagger behaves the same way with respect to it as the adjoint does for Hilbert spaces. Moreover, among the isomorphisms in $C$ we can distinguish the unitary isomorphisms because they preserve inner products.
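Example. In $\text{Rel}$, a morphism $1 \to X$ (where $1$ is a one-element set) is a subset of $X$, so the functor $\text{Hom}(1, -)$ sends a set to its collection of subsets and sends a relation $R : X \to Y$ to the function
$$S \mapsto R(S) = \{ y \in Y : x R y \text{ for some } x \in S \}.$$
(These functions are precisely the functions which preserve arbitrary unions.) If $S, T \subseteq X$ are two subsets, then $\langle S, T \rangle$ is one of the two possible subsets of $1$, the empty set and the entire set; it is empty if $S, T$ are disjoint and the entire set otherwise. Then the relation $\langle R^{\dagger}(T), S \rangle = \langle T, R(S) \rangle$ when restricted to one-element subsets $S = \{ x \}, T = \{ y \}$ says precisely that $y R^{\dagger} x \Leftrightarrow x R y$.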
Quantum weirdness is not so weird
It turns out that some important quantum phenomena, such as quantum teleportation, can be described in an abstract framework based on dagger categories. More precisely, we need dagger compact categories, which are dagger categories equipped with extra structure generalizing the tensor product and dual of Hilbert spaces. The nLab page on this subject has a nice list of references.
This suggests that part of the difference between classical and quantum mechanics boils down to the difference between dagger compact categories and a category like $\text{Set}$. A basic such difference is that in a dagger category, the two representable functors $\text{Hom}(a, -)$ and $\text{Hom}(-, a)$ are canonically (contravariantly) isomorphic, the isomorphism provided by the dagger operation. (A unitary isomorphism is then precisely an isomorphism which preserves both representable functors and which also preserves this identification between them.) This is very far from the case in a more classical category like $\text{Set}$.
Replacing $\text{Set}$ with $\text{Rel}$ already helps a great deal. Since relations behave like nondeterministic functions, they are morally much more closely related to linear operators between vector spaces than to (deterministic) functions between sets. In some sense they already are linear operators: it is possible to think of relations as being matrices over the truth semiring $\{ 0, 1 \}$ with addition defined by union (logical or) and multiplication defined by intersection (logical and). For the special case of relations between finite sets, this is abstractly because $\text{Rel}$ admits finite biproducts (given by the disjoint union) and every finite set is a biproduct of copies of $1$.
$\text{Rel}$ admits a monoidal operation given on sets by the Cartesian product. The fact that this is not the categorical product is reflected in the fact that “entangled states” exist: namely there are subsets of a Cartesian product $X \times Y$ which cannot be obtained by taking the product of a subset of $X$ with a subset of $Y$.
$\text{Rel}$ further admits an internal hom $[X, Y]$ which is also given on sets by the Cartesian product (but it is contravariant in the first variable; remember that the underlying set here is $\text{Hom}(1, [X, Y])$, so we get the set of subsets of the Cartesian product as we should), and the tensor-hom adjunction
$$\text{Hom}(X \times Y, Z) \cong \text{Hom}(X, [Y, Z])$$
holds, making $\text{Rel}$ a closed monoidal category and in fact a dagger compact category.
There is a lot more to say here, but it will have to wait for later posts.