“Eigenvectors and eigenvalues” is one of those topics that a lot of students find particularly unintuitive. Questions like “why are we doing this” and “what does this actually mean” are too often left just floating away in an unanswered sea of computations. And as I put out the videos of the series, a lot of you have commented about looking forward to visualizing this topic in particular. I suspect that the reason for this is not so much that eigen-things are particularly complicated or poorly explained. In fact, the topic is comparatively straightforward and I think most books do a fine job explaining it. The issue is that it only really makes sense if you have a solid visual understanding of many of the topics that precede it. Most important here is that you know how to think about matrices as linear transformations, but you also need to be comfortable with things like determinants, linear systems of equations and change of basis. Confusion about eigen-things usually has more to do with a shaky foundation in one of these topics than it does with eigenvectors and eigenvalues themselves. To start, consider some linear transformation in two dimensions, like the one shown here. It moves the basis vector i-hat to the coordinates (3, 0) and j-hat to (1, 2), so it’s represented with a matrix whose columns are (3, 0) and (1, 2). Focus in on what it does to one particular vector and think about the span of that vector, the line passing through the origin and its tip. Most vectors are going to get knocked off their span during the transformation. I mean, it would seem pretty coincidental if the place where the vector landed also happened to be somewhere on that line. But some special vectors do remain on their own span, meaning the effect that the matrix has on such a vector is just to stretch it or squish it, like a scalar. For this specific example, the basis vector i-hat is one such special vector.
The span of i-hat is the x-axis, and from the first column of the matrix, we can see that i-hat moves over to 3 times itself, still on that x-axis. What’s more, because of the way linear transformations work, any other vector on the x-axis is also just stretched by a factor of 3 and hence remains on its own span. A slightly sneakier vector that remains on its own span during this transformation is (-1, 1); it ends up getting stretched by a factor of 2. And again, linearity is going to imply that any other vector on the diagonal line spanned by this guy is just going to get stretched out by a factor of 2. And for this transformation, those are all the vectors with this special property of staying on their span: those on the x-axis get stretched out by a factor of 3, and those on this diagonal line get stretched by a factor of 2. Any other vector is going to get rotated somewhat during the transformation, knocked off the line that it spans. As you might have guessed by now, these special vectors are called the “eigenvectors” of the transformation, and each eigenvector has associated with it what’s called an “eigenvalue”, which is just the factor by which it is stretched or squished during the transformation. Of course, there’s nothing special about stretching vs. squishing or the fact that these eigenvalues happen to be positive. In another example, you could have an eigenvector with eigenvalue -1/2, meaning that the vector gets flipped and squished by a factor of 1/2. But the important part here is that it stays on the line that it spans without getting rotated off of it. For a glimpse of why this might be a useful thing to think about, consider some three-dimensional rotation. If you can find an eigenvector for that rotation, a vector that remains on its own span, what you have found is the axis of rotation.
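
To make the axis-of-rotation idea concrete, here is a minimal sketch in plain Python. The particular rotation is an assumption chosen for illustration (it is not the one shown in the video): it cycles the three coordinate axes, sending x to y, y to z and z to x, which is a rotation by 120 degrees about the line through (1, 1, 1). The helper names `apply`, `cross` and `stays_on_span` are hypothetical.

```python
def apply(matrix, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return tuple(sum(row[k] * v[k] for k in range(len(v))) for row in matrix)


def cross(a, b):
    """Cross product of two 3-D vectors; it is zero exactly when they are parallel."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def stays_on_span(matrix, v):
    """True if the transformed vector still lies on the line spanned by v."""
    return cross(v, apply(matrix, v)) == (0, 0, 0)


# Assumed example rotation: its columns are the landing spots of the basis
# vectors, (0, 1, 0), (0, 0, 1) and (1, 0, 0), so x -> y -> z -> x.
R = [
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]

axis = (1, 1, 1)
print(apply(R, axis))              # the axis lands on itself, so its eigenvalue is 1
print(stays_on_span(R, axis))      # True
print(stays_on_span(R, (1, 0, 0))) # False: i-hat is rotated off its own span
```

The vector (1, 1, 1) is left exactly where it started, which is what it means to be an eigenvector of a rotation with eigenvalue 1.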
And it’s much easier to think about a 3-D rotation in terms of some axis of rotation and an angle by which it is rotating, rather than thinking about the full 3-by-3 matrix associated with that transformation. In this case, by the way, the corresponding eigenvalue would have to be 1, since rotations never stretch or squish anything, so the length of the vector would remain the same. This pattern shows up a lot in linear algebra. With any linear transformation described by a matrix, you could understand what it’s doing by reading off the columns of this matrix as the landing spots for basis vectors. But often a better way to get at the heart of what the linear transformation actually does, less dependent on your particular coordinate system, is to find the eigenvectors and eigenvalues. I won’t cover the full details on methods for computing eigenvectors and eigenvalues here, but I’ll try to give an overview of the computational ideas that are most important for a conceptual understanding. Symbolically, here’s what the idea of an eigenvector looks like: A v = λ v. A is the matrix representing some transformation, with v as the eigenvector, and λ is a number, namely the corresponding eigenvalue. What this expression is saying is that the matrix-vector product A times v gives the same result as just scaling the eigenvector v by some value λ. So finding the eigenvectors and their eigenvalues of a matrix A comes down to finding the values of v and λ that make this expression true. It’s a little awkward to work with at first, because that left hand side represents matrix-vector multiplication, but the right hand side here is scalar-vector multiplication. So let’s start by rewriting that right hand side as some kind of matrix-vector multiplication, using a matrix that has the effect of scaling any vector by a factor of λ.
The columns of such a matrix will represent what happens to each basis vector, and each basis vector is simply multiplied by λ, so this matrix will have the number λ down the diagonal with 0’s everywhere else. The common way to write this guy is to factor that λ out and write it as λ times I, where I is the identity matrix with 1’s down the diagonal. With both sides looking like matrix-vector multiplication, we can subtract off that right hand side and factor out the v. So what we now have is a new matrix, A minus λ times the identity, and we’re looking for a vector v such that this new matrix times v gives the zero vector. Now this will always be true if v itself is the zero vector, but that’s boring. What we want is a non-zero eigenvector. And if you watched Chapters 5 and 6, you’ll know that the only way it’s possible for the product of a matrix with a non-zero vector to become zero is if the transformation associated with that matrix squishes space into a lower dimension. And that squishification corresponds to a zero determinant for the matrix. To be concrete, let’s say your matrix A has columns (2, 1) and (2, 3), and think about subtracting off a variable amount λ from each diagonal entry. Now imagine tweaking λ, turning a knob to change its value. As that value of λ changes, the matrix itself changes, and so the determinant of the matrix changes. The goal here is to find a value of λ that will make this determinant zero, meaning the tweaked transformation squishes space into a lower dimension. In this case, the sweet spot comes when λ equals 1. Of course, if we had chosen some other matrix, the eigenvalue might not necessarily be 1; the sweet spot might be hit at some other value of λ. So this is kind of a lot, but let’s unravel what this is saying. When λ equals 1, the matrix A minus λ times the identity squishes space onto a line. That means there’s a non-zero vector v such that A minus λ times the identity times v equals the zero vector.
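
Written out, the chain of steps just described looks roughly like this. The notation (arrows over vectors, I for the identity) is an assumption, since the on-screen expressions are not part of the transcript; the second block works through the example matrix with columns (2, 1) and (2, 3).

```latex
\begin{align*}
A\vec{v} &= \lambda \vec{v} \\
A\vec{v} &= (\lambda I)\vec{v} \\
(A - \lambda I)\vec{v} &= \vec{0} \\
\det(A - \lambda I) &= 0 \quad \text{(needed for a non-zero solution } \vec{v}\text{)}
\end{align*}

\begin{align*}
\det\begin{bmatrix} 2-\lambda & 2 \\ 1 & 3-\lambda \end{bmatrix}
  &= (2-\lambda)(3-\lambda) - 2 \cdot 1 \\
  &= \lambda^2 - 5\lambda + 4 \\
  &= (\lambda - 1)(\lambda - 4)
\end{align*}
```

The determinant is zero at λ = 1, the value singled out above. The same polynomial also vanishes at λ = 4, so this matrix has a second eigenvalue as well. At λ = 1 the altered matrix has rows (1, 2) and (1, 2), and it sends (2, -1) to the zero vector, so (2, -1) is one vector that A leaves fixed in place.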
And remember, the reason we care about that is because it means A times v equals λ times v, which you can read off as saying that the vector v is an eigenvector of A, staying on its own span during the transformation A. In this example, the corresponding eigenvalue is 1, so v would actually just stay fixed in place. Pause and ponder if you need to make sure that line of reasoning feels good. This is the kind of thing I mentioned in the introduction: if you didn’t have a solid grasp of determinants and why they relate to linear systems of equations having non-zero solutions, an expression like this would feel completely out of the blue. To see this in action, let’s revisit the example from the start with the matrix whose columns are (3, 0) and (1, 2). To find if a value λ is an eigenvalue, subtract it from the diagonal entries of this matrix and compute the determinant. Doing this, we get a certain quadratic polynomial in λ, (3-λ)(2-λ). Since λ can only be an eigenvalue if this determinant happens to be zero, you can conclude that the only possible eigenvalues are λ equals 2 and λ equals 3. To figure out what the eigenvectors are that actually have one of these eigenvalues, say λ equals 2, plug that value of λ into the matrix and then solve for which vectors this diagonally altered matrix sends to zero. If you computed this the way you would any other linear system, you’d see that the solutions are all the vectors on the diagonal line spanned by (-1, 1). This corresponds to the fact that the unaltered matrix, with columns (3, 0) and (1, 2), has the effect of stretching all those vectors by a factor of 2. Now, a 2-D transformation doesn’t have to have eigenvectors. For example, consider a rotation by 90 degrees. This doesn’t have any eigenvectors, since it rotates every vector off of its own span. If you actually try computing the eigenvalues of a rotation like this, notice what happens.
Its matrix has columns (0, 1) and (-1, 0). Subtract off λ from the diagonal elements and look for when the determinant is 0. In this case, you get the polynomial λ^2+1, and the only roots of that polynomial are the imaginary numbers i and -i. The fact that there are no real number solutions indicates that there are no eigenvectors. Another pretty interesting example worth holding in the back of your mind is a shear. This fixes i-hat in place and moves j-hat one over, so its matrix has columns (1, 0) and (1, 1). All of the vectors on the x-axis are eigenvectors with eigenvalue 1, since they remain fixed in place. In fact, these are the only eigenvectors. When you subtract off λ from the diagonals and compute the determinant, what you get is (1-λ)^2, and the only root of this expression is λ equals 1. This lines up with what we see geometrically, that all of the eigenvectors have eigenvalue 1. Keep in mind though, it’s also possible to have just one eigenvalue, but with more than just a line full of eigenvectors. A simple example is a matrix that scales everything by 2: the only eigenvalue is 2, but every vector in the plane gets to be an eigenvector with that eigenvalue. Now is another good time to pause and ponder some of this before I move on to the last topic. I want to finish off here with the idea of an eigenbasis, which relies heavily on ideas from the last video. Take a look at what happens if our basis vectors just so happened to be eigenvectors. For example, maybe i-hat is scaled by -1 and j-hat is scaled by 2. Writing their new coordinates as the columns of a matrix, notice that those scalar multiples -1 and 2, which are the eigenvalues of i-hat and j-hat, sit on the diagonal of our matrix and every other entry is a 0. Anytime a matrix has 0’s everywhere other than the diagonal, it’s called, reasonably enough, a diagonal matrix.
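
Before going further with diagonal matrices, the eigenvalue computations for the 2-D examples so far can be checked with a short sketch. It assumes nothing beyond the fact that, for a 2-by-2 matrix, det(A - λI) expands to λ^2 - (trace)λ + (determinant). The function name `real_eigenvalues` is hypothetical, and matrices are written as lists of rows, so a matrix with columns (3, 0) and (1, 2) appears as [[3, 1], [0, 2]].

```python
import math


def real_eigenvalues(matrix):
    """Return the distinct real roots of det(A - lambda*I) for a 2x2 matrix."""
    (a, b), (c, d) = matrix
    trace = a + d
    det = a * d - b * c
    # The characteristic polynomial is lambda^2 - trace*lambda + det.
    disc = trace * trace - 4 * det
    if disc < 0:
        return []  # only complex roots, so no real eigenvalues
    root = math.sqrt(disc)
    return sorted({(trace - root) / 2, (trace + root) / 2})


examples = {
    "opening example": [[3, 1], [0, 2]],
    "90-degree rotation": [[0, -1], [1, 0]],
    "shear": [[1, 1], [0, 1]],
    "scale everything by 2": [[2, 0], [0, 2]],
    "diagonal, -1 and 2": [[-1, 0], [0, 2]],
}

for name, matrix in examples.items():
    print(name, real_eigenvalues(matrix))
```

The rotation comes back with an empty list, the shear and the uniform scaling each report a single eigenvalue, and the diagonal matrix simply returns its own diagonal entries. Note that this only finds eigenvalues: it cannot tell the shear (one line of eigenvectors) apart from the uniform scaling (a whole plane of them).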
And the way to interpret this is that all the basis vectors are eigenvectors, with the diagonal entries of this matrix being their eigenvalues. There are a lot of things that make diagonal matrices much nicer to work with. One big one is that it’s easier to compute what will happen if you multiply this matrix by itself a whole bunch of times. Since all one of these matrices does is scale each basis vector by some eigenvalue, applying that matrix many times, say 100 times, is just going to correspond to scaling each basis vector by the 100-th power of the corresponding eigenvalue. In contrast, try computing the 100-th power of a non-diagonal matrix. Really, try it for a moment; it’s a nightmare. Of course, you will rarely be so lucky as to have your basis vectors also be eigenvectors, but if your transformation has a lot of eigenvectors, like the one from the start of this video, enough so that you can choose a set that spans the full space, then you could change your coordinate system so that these eigenvectors are your basis vectors. I talked about change of basis last video, but I’ll go through a super quick reminder here of how to express a transformation currently written in our coordinate system in a different system. Take the coordinates of the vectors that you want to use as a new basis, which, in this case, means our two eigenvectors, then make those coordinates the columns of a matrix, known as the change of basis matrix. When you sandwich the original transformation, putting the change of basis matrix on its right and the inverse of the change of basis matrix on its left, the result will be a matrix representing that same transformation, but from the perspective of the new basis vectors’ coordinate system. The whole point of doing this with eigenvectors is that this new matrix is guaranteed to be diagonal with its corresponding eigenvalues down that diagonal.
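
Here is a minimal sketch of that sandwich for the matrix from the start of the video, using its eigenvectors (1, 0) and (-1, 1) as the new basis. The helper names `matmul` and `inverse` and the variable names are hypothetical, and exact fractions are used so that no rounding gets in the way. The last part uses the diagonal form to compute the 100-th power and compares it against multiplying the matrix by itself 100 times.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two 2x2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(M):
    """Inverse of a 2x2 matrix with non-zero determinant, as exact fractions."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[3, 1], [0, 2]]    # columns (3, 0) and (1, 2)
P = [[1, -1], [0, 1]]   # change of basis matrix: columns are the eigenvectors
P_inv = inverse(P)

# Sandwich: inverse on the left, change of basis matrix on the right.
D = matmul(P_inv, matmul(A, P))
print(D)  # diagonal, with the eigenvalues 3 and 2 on the diagonal

# Powers of a diagonal matrix just raise each diagonal entry to that power.
D_100 = [[D[0][0] ** 100, 0], [0, D[1][1] ** 100]]
A_100 = matmul(P, matmul(D_100, P_inv))  # convert back to the standard system

# Brute-force check: multiply A by itself until we reach the 100-th power.
direct = [[1, 0], [0, 1]]
for _ in range(100):
    direct = matmul(direct, A)

print(A_100 == direct)
```

The top-right entry of the result works out to 3^100 - 2^100, which would be hard to guess from the original matrix but falls out directly from the diagonal form.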
This is because it represents working in a coordinate system where what happens to the basis vectors is that they get scaled during the transformation. A set of basis vectors that are also eigenvectors is called, again, reasonably enough, an “eigenbasis”. So if, for example, you needed to compute the 100-th power of this matrix, it would be much easier to change to an eigenbasis, compute the 100-th power in that system, then convert back to our standard system. You can’t do this with all transformations. A shear, for example, doesn’t have enough eigenvectors to span the full space. But if you can find an eigenbasis, it makes matrix operations really lovely. For those of you willing to work through a pretty neat puzzle to see what this looks like in action and how it can be used to produce some surprising results, I’ll leave up a prompt here on the screen. It takes a bit of work, but I think you’ll enjoy it. The next and final video of this series is going to be on abstract vector spaces. See you then!