Determinants – Part 4

Ah, we’ve finally arrived! We have defined, played with, and even proved the existence of alternating n-linear forms on a vector space V. Even better, we have defined the determinant of a linear transformation, a concept not usually seen in elementary linear algebra (okay, that’s a bit of a stretch: nearly everyone sees determinants of matrices and matrices of linear transformations, and it’s not too hard to define the determinant of a transformation to be, say, the determinant of its matrix with respect to some fixed basis. The difficulty there is in proving the determinant is independent of the choice of basis. We have avoided this issue by going straight to linear transformations). So what is left to do here? Well, there are two main topics I’d like to address. First, I’ll talk a little about determinants of matrices and how they relate to alternating forms. Second, and I’m more excited about this one, I’ll talk about applications of determinants to integration theory and try to give some insight as to why this all works (I’m not going to get it all into this post, so I’ll have to write an additional post on the subject; this is, however, the last post on the algebraic construction of determinants, so if that’s the extent of your interest, enjoy the conclusion of my discussion of determinants!). A warning: some knowledge of measure theory will be assumed for the latter part of the post (not too much though!).

So, first let’s talk just a little about determinants of matrices. I won’t say much because it’s mostly computational, but there is one main result I want to state and prove. First, an n\times n matrix A determines a linear transformation (namely L(x)=Ax), and the determinant of the matrix is defined to be the determinant of this associated linear transformation. Now, note that we can compute the determinant of a matrix straight from our definition, but it’s a pain. Basically, we know that if A is the matrix of a linear transformation with respect to some basis (x_1,\dots, x_n) of V, then det(A)\omega(x_1,\dots,x_n)=\omega(Ax_1,\dots,Ax_n) for all alternating n-linear forms \omega on V, so if we replace each Ax_j on the right hand side by \sum_i \alpha_{i,j}x_i (where the \alpha_{i,j} are the entries of the matrix) and expand using multilinearity, we can figure out the value of the determinant. I won’t go through the computation because, well, it’s painful and not very insightful. If you really want to see it, check out Halmos’s book. Basically it turns out that det(A)=\sum_{\pi\in S_n}(\text{sgn}\,\pi)\,\alpha_{\pi(1),1}\cdots\alpha_{\pi(n),n}. So… yeah, painful!
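
If you’d like to see the permutation formula in action without grinding through the algebra by hand, here is a minimal numerical sketch (my own illustration, not from Halmos): it evaluates the sum over S_n directly and compares the result against numpy’s built-in determinant. The n! terms are exactly why this is painful; don’t try it for large n.

```python
# Numerical sanity check of the permutation formula
# det(A) = sum over pi in S_n of sgn(pi) * a_{pi(1),1} * ... * a_{pi(n),n}.
import itertools
import numpy as np

def sign(perm):
    """Sign of a permutation, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def leibniz_det(A):
    """det(A) via the sum over all permutations of {0, ..., n-1}."""
    n = A.shape[0]
    total = 0.0
    for pi in itertools.permutations(range(n)):
        term = sign(pi)
        for col in range(n):
            term *= A[pi[col], col]   # pick row pi(col) in column col
        total += term
    return total

A = np.random.rand(4, 4)
print(leibniz_det(A), np.linalg.det(A))  # the two should agree up to rounding
```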

Okay, now for something more interesting. I’ve been saying all along that the determinant is an alternating n-linear form on the rows of the matrix, but if you notice, we didn’t really define it that way, and it’s kind of hard to see this connection immediately. We defined the determinant as the unique scalar associated with the induced linear transformation \overline{A} on the space of alternating n-linear forms on V. Now, as usual, I’ll prove the result for the actual linear transformation, and then it easily carries over to the matrix case. Here we go:

Proposition: Let \{x_1,\dots,x_n\} be a basis for V and let \{y_1,\dots, y_n\} be vectors in V. Let L be the linear transformation defined by Lx_i=y_i and let \omega(y_1,\dots, y_n)=det(L). Then \omega is an alternating n-linear form.
Proof: Let \nu be a non-zero alternating n-linear form on V. By the definition of the determinant we have \nu(y_1,\dots,y_n)=\nu(Lx_1,\dots, Lx_n)=(\overline{L}\nu)(x_1,\dots,x_n)=det(L)\nu(x_1,\dots,x_n). But note that c^{-1}:=\nu(x_1,\dots,x_n)\neq 0 because the x_i form a basis and hence are independent (a non-zero alternating n-linear form cannot vanish on a basis). But then \omega(y_1,\dots,y_n)=det(L)=c\,\nu(y_1,\dots,y_n), that is, \omega=c\nu, and since the alternating n-linear forms on V form a vector space, it follows that \omega is indeed an alternating n-linear form.

So how does this relate to matrices? Well, remember the definition of a coordinate matrix! A coordinate matrix for L is obtained by taking some basis in the domain space (above, the x_i) and evaluating the linear transformation on this basis. Then we take the coordinate vectors of the vectors Lx_i with respect to some basis in the range space (in our case, since L:V\to V, we’ll take the second basis to be the x_i again), and so the matrix of L is the matrix whose columns are the coordinate vectors of the y_i=Lx_i. Thus we have a function on the columns of the matrix, and we have shown it is alternating and n-linear! Usually this is done as a function on the rows, but it’s not hard to show that the determinant of a linear transformation equals the determinant of its adjoint A', whose matrix (with respect to the dual basis of whichever basis you used for A) is the transpose of the matrix of A, and of course the columns of a matrix A are equal to the rows of A^T, so as usual the two points of view are completely equivalent.
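
Here is a small numerical illustration (mine, not part of the argument above): treating the determinant as a function of the columns of a matrix, we can spot-check the alternating and multilinearity properties on random vectors, and also check that det(A)=det(A^T), which is what lets you tell the same story about rows.

```python
# det, viewed as a function of the columns (the coordinate vectors of the y_i),
# is alternating and linear in each argument; det(A) = det(A^T) transfers this to rows.
import numpy as np

rng = np.random.default_rng(0)
y = [rng.standard_normal(4) for _ in range(4)]  # four vectors in R^4

def omega(vectors):
    """det of the matrix whose columns are the given vectors."""
    return np.linalg.det(np.column_stack(vectors))

# Alternating: swapping two arguments flips the sign.
swapped = [y[1], y[0], y[2], y[3]]
print(np.isclose(omega(swapped), -omega(y)))             # True

# Linear in each argument: scale/add in the first slot only.
c, z = 2.5, rng.standard_normal(4)
lhs = omega([c * y[0] + z, y[1], y[2], y[3]])
rhs = c * omega(y) + omega([z, y[1], y[2], y[3]])
print(np.isclose(lhs, rhs))                              # True

# Rows vs. columns: det(A) = det(A^T).
A = np.column_stack(y)
print(np.isclose(np.linalg.det(A), np.linalg.det(A.T)))  # True
```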

Here is one last paragraph on matrices. I know in the comments I was asked by Andy and Jay to comment on the Cayley–Hamilton theorem and on products of eigenvalues, respectively. Unfortunately for both of you, you may be disappointed by my comments here, but perhaps I can write about these in the future (after I talk about some analysis!). For both of your questions, we need a much more developed theory of eigenvalues before we can really answer either one. The Cayley–Hamilton theorem is related to the Jordan canonical form of a matrix (and the minimal polynomial), which is closely related to invariant subspaces and eigenvalues. If you are interested, and perhaps I will write about this later, Halmos basically states the Cayley–Hamilton theorem as a corollary in his exposition after developing the Jordan canonical form language, so it’s an easy-enough proof once the machinery is built up. As for Jay’s question, i.e. why the determinant is the product of the eigenvalues, again I don’t have great intuition, but I can offer you a computational explanation. If we talk in terms of matrices, every matrix A (over an algebraically closed field, say \mathbb{C}) is similar to its Jordan canonical form J (so A=PJP^{-1} for some invertible matrix P), and J is upper triangular. It’s easy to check that the determinant of any triangular matrix is the product of its diagonal entries, which for J are exactly the eigenvalues of A (with multiplicity). But since determinants are multiplicative (and multiplication in the scalar field is commutative) we have det(A)=det(P)det(J)det(P^{-1})=det(PP^{-1})det(J)=det(J), which is the product of the eigenvalues. Again, sorry about the lack of insight, but I think we need more language for eigenvalues before we can really tackle that question.
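
For what it’s worth, here is a quick numerical sanity check (not a proof, and no substitute for the eigenvalue machinery): for a random real matrix the eigenvalues are complex in general, but their product comes out essentially real and matches the determinant.

```python
# Sanity check: det(A) equals the product of the eigenvalues of A, with multiplicity.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))

eigenvalues = np.linalg.eigvals(A)   # complex in general
product = np.prod(eigenvalues)       # imaginary parts cancel for a real matrix

print(np.isclose(product.real, np.linalg.det(A)))  # True
print(abs(product.imag) < 1e-8)                    # True (roundoff only)
```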

Okay! Now for the fun part! Let’s (start to) talk about integration. If you are rusty on your measure theory (or haven’t seen it before), I direct you to Adam’s introduction-to-measures series. Now, one of the nice properties of Lebesgue measure is its invariance under translations and rotations, as well as its nice behavior with respect to dilation (m(rE)=r^nm(E) in \mathbb{R}^n). In other words, for example, if we fix some set E\subset\mathbb{R}^2 and, say, rotate it through some angle \theta about the origin, then the rotated set R_\theta E satisfies m(R_\theta E)=m(E): the measure does not change (which of course matches our geometric intuition about area [2-d volume] being invariant under rigid motions). Of course this is not true of all measures; take for example the Dirac mass \delta_0 at zero in \mathbb{R}. Then \delta_0((-1,1))=1, but \delta_0((1,2))=0, so translating the set (-1,1) by 2 changes its measure. So Lebesgue measure behaves nicely under certain plane motions, but what about general integration? What if E\subset\mathbb{R}^n and I know the value of \int_E f\hspace{1mm}dm and I want to know the value of \int_{E'} f\hspace{1mm}dm, where E' is, say, the set E rotated or stretched by some factor? It turns out that if we have an invertible linear transformation T:\mathbb{R}^n\to\mathbb{R}^n, we can always compute \int_{TE}f\hspace{1mm}dm given knowledge of \int_E f\hspace{1mm}dm. I’ll give you the formula now, but we’ll really need another post to delve into the details as to why this works. This theorem is quoted from Folland:

Theorem: Let T be an invertible linear transformation of \mathbb{R}^n. If E\subset\mathbb{R}^n is Lebesgue measurable, then so is T(E), and m(T(E))=|det(T)|\,m(E). Moreover, if f is Lebesgue measurable then so is f\circ T, and if f\geq 0 or f\in L^1(m), then \int f\hspace{1mm}dm=|det(T)|\int f\circ T\hspace{1mm}dm.
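
Before getting anywhere near the proof, here is a rough Monte Carlo sketch (my own, emphatically not Folland’s argument) of the measure statement for one concrete T: take E to be the unit square in \mathbb{R}^2, so m(E)=1, and estimate m(T(E)) by sampling points in a bounding box of T(E).

```python
# Rough Monte Carlo check of m(T(E)) = |det(T)| * m(E), with E the unit square in R^2.
import numpy as np

rng = np.random.default_rng(2)
T = np.array([[2.0, 1.0],
              [0.5, 3.0]])             # an invertible linear map, det = 5.5
T_inv = np.linalg.inv(T)

# Bounding box of T(E): images of the square's corners.
corners = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
images = corners @ T.T
lo, hi = images.min(axis=0), images.max(axis=0)
box_area = np.prod(hi - lo)

# Estimate m(T(E)): sample the box uniformly and test whether T^{-1}(x) lies in E.
N = 200_000
samples = rng.uniform(lo, hi, size=(N, 2))
preimages = samples @ T_inv.T
inside = np.all((preimages >= 0) & (preimages <= 1), axis=1)
estimate = box_area * inside.mean()

print(estimate, abs(np.linalg.det(T)))  # both should be close to 5.5
```

With a couple hundred thousand samples the estimate should hover within a percent or so of |det(T)|, which is all the first part of the theorem claims for this particular E and T.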

I won’t prove this here because it’s kind of boring without going into more detail (which I will do in the next few posts, of course!), but essentially Folland’s proof is your typical measure theory proof: reduce to the simplest pieces possible. One shows that it suffices to handle compositions, verifies the formula for elementary linear transformations (working with simple functions), and builds up from there. For now, though, I’ll prove one preliminary proposition that I’ll use in the next post. Recall that a measurable function is (like a continuous function between topological spaces) a structure-preserving map with respect to preimages: the preimage of any measurable set is measurable. Let’s exploit this idea to construct a new measure from an old one.

Definition/Proposition: Let (X,\mathcal{M}) and (Y,\mathcal{N}) be measurable spaces, let \mu be a measure on (X,\mathcal{M}), and let G:(X,\mathcal{M})\to(Y,\mathcal{N}) be a measurable map. Then the function G_*\mu:\mathcal{N}\to[0,+\infty] defined by G_*\mu(E)=\mu(G^{-1}(E)) is a measure, called the push-forward of \mu by G.
Proof: Clearly G_*\mu(\emptyset)=0, and if \{E_i\}_{i\in\mathbb{N}} is a collection of disjoint sets in \mathcal{N}, then we have

\displaystyle G_*\mu\left(\bigcup_{i\in\mathbb{N}} E_i\right)=\mu\left(G^{-1}\left(\bigcup_{i\in\mathbb{N}}E_i\right)\right)=\mu\left(\bigcup_{i\in\mathbb{N}}G^{-1}(E_i)\right)

But since the E_i are disjoint, so are their preimages, and we can apply the countable additivity of \mu to conclude that G_*\mu is indeed a measure.
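
To make the definition concrete, here is a toy example (entirely made up, just for illustration): a finite measure on X=\{0,1,2,3\}, given by four point masses, pushed forward under the map G(x)=x \bmod 2 to Y=\{0,1\}.

```python
# Toy push-forward: a finite measure mu on X = {0, 1, 2, 3} pushed forward by G(x) = x mod 2.
mu = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}   # weights of the point masses

def G(x):
    return x % 2

def pushforward(E):
    """(G_* mu)(E) = mu(G^{-1}(E)) for a subset E of Y = {0, 1}."""
    preimage = {x for x in mu if G(x) in E}
    return sum(mu[x] for x in preimage)

print(pushforward({0}))      # mu({0, 2}) = 0.4
print(pushforward({1}))      # mu({1, 3}) = 0.6
print(pushforward({0, 1}))   # mu(X)      = 1.0, additivity in miniature
```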

So you can already see where this is going. Since linear transformations on \mathbb{R}^n are continuous, they are measurable, so we can look at the push-forward T_*m on (\mathbb{R}^n,\mathcal{L}) to get a new measure that we can integrate with respect to. Now, can we understand how G_*\mu behaves with respect to integration? Of course! We know how it behaves with respect to simple functions, and in measure theory that is almost always enough.

Proposition: Let (X,\mathcal{M}) and (Y,\mathcal{N}) be measurable spaces, \mu a measure on (X,\mathcal{M}), and G_*\mu the push-forward of \mu to Y by a measurable map G. If f\in L^1(G_*\mu), then f\circ G\in L^1(\mu), and

\displaystyle \int_Y f\hspace{1mm}d(G_*\mu)=\int_X f\circ G\hspace{1mm}d\mu

Proof: We prove the statement first for f=\alpha\chi_E with E\in\mathcal{N}. In this case, by definition, the integral on the left is \alpha\, G_*\mu(E)=\alpha\,\mu(G^{-1}(E)). But on the right, f\circ G=\alpha\chi_{G^{-1}(E)} (think about this a little), so the integral on the right is \alpha times the \mu-measure of this set, and the two agree. Thus the statement holds for characteristic functions, and so by linearity it holds for simple functions. Then for non-negative functions the monotone convergence theorem gives the conclusion, and for general integrable functions we have the conclusion by considering positive and negative (and real/imaginary) parts.
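
Continuing the made-up discrete example from above (point masses, so the integrals are just finite sums), we can check the identity numerically by computing both sides.

```python
# Change of variables for a discrete push-forward:
# integral of f against G_* mu  vs.  integral of f o G against mu.
mu = {0: 0.1, 1: 0.2, 2: 0.3, 3: 0.4}
G = lambda x: x % 2
f = lambda y: 10.0 if y == 0 else -3.0   # any function on Y = {0, 1}

# Left-hand side: integrate f against the push-forward measure.
pushforward = {y: sum(w for x, w in mu.items() if G(x) == y) for y in {0, 1}}
lhs = sum(f(y) * w for y, w in pushforward.items())

# Right-hand side: integrate f o G against mu directly.
rhs = sum(f(G(x)) * w for x, w in mu.items())

print(lhs, rhs)   # both equal 10.0 * 0.4 + (-3.0) * 0.6 = 2.2
```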

So, this is very cool stuff! We have a nice way of constructing new measures from old ones, and we even know how to integrate with respect to these “image” measures (in Schilling’s words) using our knowledge of the reference measure \mu. In the next post, I’d like to try to better understand this new measure G_*\mu in terms of derivatives of measures, so I’ll write a little bit about that topic, then try to connect the idea of a push-forward measure to the idea of a derivative of a measure, and then ultimately connect both back to determinants. If all goes according to plan, we’ll have a nice little proof of the theorem above with, I believe, a much deeper understanding of the underlying machinery than one gains by reading Folland.
