The Riesz Representation Theorem – Part 1

Hello everyone! I realize it’s been far too long since I last posted, and I’ve decided I don’t really want to write about Radon-Nikodym anymore. Maybe someday if I get requests I’ll write a couple more posts in that series, but for now I’m done. Long story short, we can often use the determinant of a linear transformation and some given reference measure to compute measures of sets under said linear transformations (or, if you like, one can generalize to smooth coordinate transformations by using the Jacobian, which gives a “locally linear” coordinate change).

Anyways, I want to write about something different. Last year in the office, we began discussing the neatest results we had seen in our undergraduate courses. We listed some neat facts, and as I recall Jay enjoyed the result from linear algebra that given any n and any x_0\in[0,1], there is a unique polynomial q(x) of degree at most n so that p(x_0)=\int_0^1p(x)q(x)\hspace{1mm}dx for every polynomial p of degree at most n. This is of course a special case of the famous Riesz Representation theorem. This theorem has many different statements depending on the context it’s used in. For our linear algebra class, we were given this particular one:

Theorem: Riesz Representation Theorem: Let V be a finite dimensional vector space over \mathbb{C} and \langle\cdot,\cdot\rangle an inner product on V. If f:V\to\mathbb{C} is a linear functional on V, then there is a unique vector z_f\in V so that f(x)=\langle x,z_f\rangle for all x\in V.

The result about polynomials is then a special case, where V=P_n (on [0,1]), \langle p,q\rangle=\int_0^1 p(x)\overline{q(x)}\hspace{1mm}dx (the conjugate disappears for polynomials with real coefficients), and f:P_n\to\mathbb{C} is given by f(p)=p(x_0). The proof of this particular version of the RRT is not too difficult. When working in a finite dimensional space, one simply writes down a vector that works, and uniqueness follows from a simple calculation. I leave it to the reader to work out the proof of this particular version of Riesz. If you get stuck, check out (for example) Axler’s book Linear Algebra Done Right or Halmos’s Finite Dimensional Vector Spaces.
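For readers who like to see this concretely, here is a minimal sketch (in Python, which this post does not otherwise use) of how one might compute q in the polynomial example, assuming real coefficients. It solves the linear system Gc=v, where G_{ij}=\int_0^1 x^{i+j}\hspace{1mm}dx=1/(i+j+1) is the Gram matrix of the monomial basis 1,x,\dots,x^n and v_i=x_0^i. The function names are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def riesz_polynomial(n, x0):
    """Coefficients c_0..c_n of q with p(x0) = integral_0^1 p(x) q(x) dx for all p in P_n."""
    size = n + 1
    # Augmented system [G | v]: G[i][j] = <x^i, x^j> = 1/(i+j+1), v[i] = x0^i.
    rows = [
        [Fraction(1, i + j + 1) for j in range(size)] + [Fraction(x0) ** i]
        for i in range(size)
    ]
    # Gauss-Jordan elimination in exact rational arithmetic.
    for col in range(size):
        pivot = next(r for r in range(col, size) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [entry / pivot_value for entry in rows[col]]
        for r in range(size):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][size] for i in range(size)]


def integrate_product(p, q):
    """Exact integral over [0, 1] of the product of two coefficient lists."""
    return sum(
        a * b * Fraction(1, i + j + 1)
        for i, a in enumerate(p)
        for j, b in enumerate(q)
    )


def evaluate(p, x):
    """Value of the polynomial with coefficient list p at the point x."""
    return sum(a * Fraction(x) ** i for i, a in enumerate(p))


x0 = Fraction(1, 2)
q = riesz_polynomial(2, x0)
p = [Fraction(3), Fraction(-1), Fraction(4)]  # p(x) = 3 - x + 4x^2
print(integrate_product(p, q) == evaluate(p, x0))
```

Exact fractions are used here because the Gram matrix is a Hilbert matrix, which is badly conditioned in floating point as n grows.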

One of the (perhaps shocking) results in mathematics is that this result carries over to the case of infinite dimensions, provided we restrict ourselves to continuous functionals and complete inner product spaces. It turns out that in finite dimensional normed spaces (hence inner product spaces), every linear map is continuous, and all norms are equivalent. In fact we can pretty much show that any finite dimensional space looks like \mathbb{C}^n (or \mathbb{R}^n for real spaces), which is complete, where n is the dimension of the space in question. But what about the case of infinite dimensions? What if I were to replace P_n with the space of all continuous functions on [0,1]? Maybe I want to use the space l^2(\mathbb{N}) whose elements are all square summable sequences of complex numbers. What if I don’t have an inner product? Is there still a way to represent functionals by other vectors in my space (or a similar space)? In this set of posts, I’ll develop the machinery for, and ultimately prove, stronger versions of the Riesz Representation theorem that apply to infinite dimensional vector spaces as well.

As I mentioned above, we need to put a few more restrictions on the functions and spaces we will consider when we pass over to infinite dimensions. I’ll spend the rest of this post talking about this. If we are working in an inner product space, we get a natural norm given by ||x||=\langle x,x\rangle^{1/2}. In fact many of these results about functions will carry over to general normed spaces, so let’s consider these. Recall that in finite dimensional linear algebra, any two norms on a space are equivalent, and every linear map is continuous (with respect to the norms), so often in a basic linear algebra course (or even in an advanced linear algebra course), continuity is not discussed. Unfortunately, when we pass to infinite dimensions we do in fact lose this pleasantry. I’ll give you an example of a discontinuous linear map shortly, but first, here’s a nice characterization of continuity for linear maps.

Proposition Let X,Y be normed linear spaces and let T:X\to Y be a linear map. Then T is continuous if and only if there is a constant C>0 so that ||Tx||_Y\leq C||x||_X for all x\in X. Moreover, T is continuous if and only if it is continuous at 0.

I’ll leave the proof of this fact to the reader; it’s a relatively straightforward calculation. The important thing is we can now make the following:

Definition Let f:X\to Y be a linear map between normed spaces. If f is continuous, we say f is bounded. We denote ||f||=\inf\{C:||f(x)||\leq C||x||\text{ for all }x\in X\} and we call this number the norm of the function f. We denote by L(X,Y) the set of all bounded linear maps f:X\to Y. If Y=\mathbb{C}, we call L(X,\mathbb{C}) the dual space of X and denote it by X^*. The reader may verify that the function ||\cdot||:L(X,Y)\to\mathbb{R} is a norm on L(X,Y) for any normed space Y.
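To make the definition a little more tangible, here is a small sketch in Python (my choice of language, not something from the post) for the functional f(x)=\langle x,a\rangle on \mathbb{R}^n with the Euclidean norm. It uses the standard fact that the infimum in the definition equals \sup\{|f(x)|/||x||:x\neq 0\}; by Cauchy-Schwarz this supremum is ||a||, attained at x=a. The helper names and the random sampling scheme are illustrative assumptions, and the sampled value is only a lower estimate of the norm.

```python
import math
import random


def functional(a):
    """Return f(x) = <x, a> on R^n with the usual dot product."""
    return lambda x: sum(xi * ai for xi, ai in zip(x, a))


def euclidean_norm(x):
    return math.sqrt(sum(xi * xi for xi in x))


def estimate_norm(f, dim, trials=10_000, seed=0):
    """Lower estimate of ||f|| = sup |f(x)| / ||x|| from random nonzero samples."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm_x = euclidean_norm(x)
        if norm_x > 0:
            best = max(best, abs(f(x)) / norm_x)
    return best


a = [3.0, -4.0, 12.0]
f = functional(a)
exact = euclidean_norm(a)            # Cauchy-Schwarz bound, attained at x = a
sampled = estimate_norm(f, len(a))   # never exceeds the exact norm (up to rounding)
print(exact, sampled)
```

Notice that the vector a representing f has the same norm as f itself, which is a small preview of what the Riesz Representation theorem says in general.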


Examples

  1. Let X=C^1([0,1]) be the set of all continuously differentiable functions on [0,1], and define D:C^1([0,1])\to C([0,1]) by D(f)=f', where we use the uniform norm on both spaces, i.e. ||f||=\sup\{|f(x)|:x\in[0,1]\}. Then D is linear (by elementary properties of derivatives), but taking f_n(x)=x^n, for instance, gives

    ||f_n||=1\quad\text{while}\quad||D(f_n)||=\sup\{nx^{n-1}:x\in[0,1]\}=n,

    so D is not bounded and hence not continuous. It’s perhaps interesting to note that this space C^1([0,1]) is not complete in this norm! For example, the Weierstrass approximation theorem (a special case of Stone-Weierstrass) tells us that the function f(x)=|x-1/2| can be uniformly approximated on [0,1] by polynomials (which are continuously differentiable infinitely many times!) but of course f' is not even defined at x=1/2. To resolve this, one can use the norm ||f||_1=||f||_u+||f'||_u, where ||\cdot||_u is the uniform norm, on C^1([0,1]). Then (exercise) C^1([0,1]) is complete with respect to this norm. Moreover, for any f\in C^1([0,1]), we have

    ||D(f)||_u=||f'||_u\leq ||f'||_u+||f||_u=||f||_1,

    so indeed D is bounded with respect to this norm, and ||D||\leq 1! This shows that the two norms ||\cdot||_1 and ||\cdot||_u are not equivalent on C^1([0,1]). As you can see, things really do start to behave strangely in infinite dimensions!

  2. Let H be a finite dimensional inner product space with orthonormal basis \{u_1,\dots, u_n\} and norm ||x||=\langle x,x\rangle^{1/2}. Define U:H\to\mathbb{C}^n (with its usual norm) by U(\sum_{k=1}^n a_ku_k)=(a_1,\dots,a_n)=\sum_{k=1}^n a_ke_k, where e_1,\dots,e_n denotes the standard basis for \mathbb{C}^n. By the Pythagorean theorem, for all x=\sum_{k=1}^n a_ku_k\in H, we have

    ||x||^2=||\sum_{k=1}^n a_ku_k||^2=\sum_{k=1}^n|a_k|^2=||\sum_{k=1}^n a_ke_k||^2=||Ux||^2.

    Thus U is continuous, ||U||=1, and in fact one can show that \langle Ux,Uy\rangle_{\mathbb{C}^n}=\langle x,y\rangle_{H} for all x,y\in H. U is an example of a unitary operator: a surjective linear map that preserves the inner product. Unitary operators are the natural isomorphisms between Hilbert spaces.

  3. Let X be a normed space. For x\in X, define \hat x:X^*\to\mathbb{C} by \hat x(f)=f(x). The reader may verify that \hat x is linear, that \hat x\in (X^*)^*, and that ||\hat x||\leq||x||; the equality ||\hat x||=||x|| also holds, but it relies on the Hahn-Banach theorem. The space (X^*)^* is often denoted X^{**}. The map x\mapsto\hat x is a linear isometry (i.e. a norm-preserving linear map) from X to X^{**}. Note that if T is a linear isometry, then Tx=0 implies ||x||=||Tx||=0, so x=0; that is, any linear isometry is automatically an injection, and hence is an isomorphism onto its range. I’ll often denote by \widehat X the image of X under this correspondence. A Banach space is a normed vector space that is complete with respect to the metric d(x,y)=||x-y||. The reader may verify that X^* is always a Banach space (and hence so is X^{**}), and from this it follows that X is complete if and only if \widehat X is a closed subspace of X^{**}. Moreover, if X is not complete, the map x\mapsto\hat x embeds X as a dense subspace of the closure of \widehat X, which is itself a Banach space. The closure of \widehat X in X^{**} is called the completion of X (with respect to its norm). If \widehat X=X^{**}, we say X is reflexive.
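Returning to Example 1 for a moment, here is a rough numerical sketch (in Python, with hypothetical helper names) of how the derivative map behaves under the two norms. It uses the family f_n(x)=x^n as a test case, which is my choice of illustration, and it approximates the supremum on a finite grid, so it is a sanity check rather than a proof.

```python
def sup_norm(g, samples=10_001):
    """Approximate sup |g| over [0, 1] on a uniform grid that includes both endpoints."""
    return max(abs(g(k / (samples - 1))) for k in range(samples))


def ratios(n):
    """Return (||D f_n||_u / ||f_n||_u, ||D f_n||_u / ||f_n||_1) for f_n(x) = x^n."""
    f = lambda x: x ** n
    df = lambda x: n * x ** (n - 1)
    uniform = sup_norm(f)          # ||f_n||_u
    derivative = sup_norm(df)      # ||f_n'||_u = ||D f_n||_u
    return derivative / uniform, derivative / (uniform + derivative)


for n in (1, 10, 100):
    print(n, ratios(n))
```

The first ratio grows with n, so no single constant C can work for the uniform norm on the domain, while the second ratio stays below 1, in line with ||D||\leq 1 for the norm ||\cdot||_1.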

Well, I don’t have a lot more to say for tonight as far as theorems and propositions go, but I’ll make just a few more comments on the above examples before I finish this post. In example three, the space X^{**} is called the double dual of X. If X is finite dimensional with some basis \{x_1,\dots, x_n\}, define \delta_i\in X^* by \delta_i(x_j)=\delta_{i,j} (the Kronecker delta), i.e. \delta_i(x_i)=1 and \delta_i(x_j)=0 for j\neq i. The set \{\delta_i\}_1^n is a basis for X^*, and hence X is isomorphic to its dual. The same argument shows that X^*\simeq X^{**}, so X and X^{**} have the same dimension. Since x\mapsto\hat x is an injective linear map between spaces of the same finite dimension, it is surjective, and so in finite dimensions every space is reflexive. In infinite dimensions, there are examples of spaces that are not reflexive. WARNING: if X is an infinite dimensional vector space, let X' be the set of all linear functionals on X, and X'' the set of all linear functionals on X' (i.e. X',X'' are the algebraic dual and double dual of X). Then X is NEVER isomorphic to X''. This is one of the first places where our analytic notions pay some dividends. For example, the space l^2(\mathbb{N}) from above will turn out to be reflexive, even though its algebraic dual is much larger than the original space.
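As a tiny illustration of the dual basis in finite dimensions, here is a sketch in Python for X=\mathbb{R}^2 (a real, two dimensional stand-in chosen purely for simplicity). If the basis vectors x_1,x_2 are the columns of a matrix, then the rows of the inverse matrix represent the functionals \delta_1,\delta_2. The function names are hypothetical.

```python
from fractions import Fraction


def dual_basis_2d(x1, x2):
    """Rows of the inverse of the matrix with columns x1, x2; row i represents delta_i."""
    a, c = map(Fraction, x1)
    b, d = map(Fraction, x2)
    det = a * d - b * c
    if det == 0:
        raise ValueError("x1 and x2 are linearly dependent")
    return [(d / det, -b / det), (-c / det, a / det)]


def evaluate(functional, x):
    """Apply a functional, stored as a row of coefficients, to the vector x."""
    return sum(fi * xi for fi, xi in zip(functional, x))


x1, x2 = (1, 1), (1, 2)
deltas = dual_basis_2d(x1, x2)

# delta_i(x_j) is 1 when i == j and 0 otherwise.
print([[evaluate(delta, x) for x in (x1, x2)] for delta in deltas])

# Any x is recovered from its coordinates delta_1(x), delta_2(x).
x = (5, 7)
coords = [evaluate(delta, x) for delta in deltas]
rebuilt = tuple(coords[0] * u + coords[1] * v for u, v in zip(x1, x2))
print(coords, rebuilt)
```

The same matrix-inverse recipe works in any finite dimension; it is exactly the step that has no obvious analogue once the basis is infinite.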

One final comment: if X is a finite dimensional space, the map x_i\mapsto \delta_i gives a (somewhat) canonical identification of X with X^*. In infinite dimensions, such a correspondence need not exist. We could try a similar trick, by writing down a basis for X and defining \delta_i in the same way. Unfortunately, this map need not be continuous. We could try and resolve this by defining \delta_i:\text{span}\{x_i\}\to\mathbb{C} by \delta_i(ax_i)=a. The Hahn-Banach theorem (which maybe Adam could write about sometime?) tells us that \delta_i can be extended to a bounded map on all of X, but unfortunately we have absolutely no idea how to evaluate this extension at x if x\notin\text{span}\{x_i\}! One of the nice consequences of the Riesz Representation theorem is that it gives us a canonical map from an inner product space to its dual. We have this for finite dimensional spaces, and hopefully in the next couple of posts, I’ll show you that in fact we can do this for infinite dimensional spaces as well. Until then, enjoy!


About Ryan

I'm a software developer at Hudl where I work on awesome software. Before that, I was a grad student in mathematics, interested in probability theory as well as analysis, more on the side of functional analysis and less on the side of PDEs. Apart from that I'm pretty lame. Though I do enjoy watching football, playing golf, and playing the trumpet.
This entry was posted in Analysis, Linear Algebra.

3 Responses to The Riesz Representation Theorem – Part 1

  1. Jay Cummings says:

    Great post, Hotpants! You mentioned in Example 1 that C^1([0,1]) is an infinite dimensional space. I’m curious if you have a basis in mind. Maybe Fourier basis? Or is there a “simpler” one.

  2. Ryan says:

    Good question Jay! That actually brings up some interesting questions. First, note that P=\{x^n:n\in\mathbb{N}\} is an infinite, linearly independent subset of C^1([0,1]), so this space really is infinite dimensional. When you say basis, you have to be careful though! There are two notions of basis: one is an algebraic basis (sometimes called a Hamel basis), in which case I’m not sure I can write one down for you (as Zach has written about, existence of a Hamel basis for an arbitrary vector space is equivalent to the axiom of choice). The thing with Hamel bases is that every element of X must be expressible as a finite linear combination of basis elements. Note that since we are using finite sums, there are no issues of convergence.

    This is different from an analytic (Schauder) basis, where we essentially allow infinite sums. To be precise, a Schauder basis is a sequence \{b_n\}_1^\infty in a Banach space X so that for every x\in X there is a unique sequence \{c_n\}_1^\infty of scalars so that \sum_1^\infty c_nb_n=x, where convergence is with respect to the norm. Note, I think you can do this construction for non-separable spaces, you just have to ensure that for all x all but countably many coefficients are zero. I know you can do this for a Hilbert space (will talk about this in later posts), but I’m not 100% sure for arbitrary Banach spaces. Now, by Stone-Weierstrass, polynomials are uniformly dense in C^1([0,1]), and I believe one can show by a greedy algorithm argument that in fact \overline{\text{span}\{1,x,x^2,\dots\}}=C^1([0,1]) (I’m 95% sure this is true; Wikipedia claims that C([0,1]) admits a Schauder basis in the uniform norm. I’m pretty sure this construction above works for C^1([0,1]).) The Fourier basis is used for Hilbert space and its associated norm. I’ll talk about this in a post later, but essentially it will happen that given any f\in L^2([0,1]), we have f=\sum_{n\in\mathbb{Z}} a_n e^{2\pi inx}, where convergence is now in the L^2 norm, i.e. the integral of the squared difference between f and the partial sums goes to zero.

    So, sorry for the long response… hope this answers your questions!

    • Jay Cummings says:

      Don’t apologize for the long response, that was great! Thanks! I would say that you have made me look forward even more to your next post, but as I thought that my email dinged telling me that you have *already* posted again. Excellent!
