Why Does Eigendecomposition Matter?
Eigendecomposition appears in the following sections of The Elements of Statistical Learning, a textbook widely used in machine learning classes, to explain how to compute the parameters of statistical methods, so I think understanding it will help us understand statistics better.
- Principal Component Analysis (PCA)
- Ridge Regression
- Linear Discriminant Analysis (LDA)
Big Picture
Eigendecomposition is a way to break a given matrix down into scalars (numbers) and vectors (lists of numbers). The vectors represent special directions of the given matrix, and the scalars represent how special each corresponding vector is. The scalar is called an eigenvalue, and the vector an eigenvector.
Eigenvalue
An eigenvalue is something we need before we can get an eigenvector: it tells us the scale of the eigenvector, and in statistics it is also used as a measure of how important a certain direction in the data is.
How to Compute Eigenvalue
We start with the following equation:

$$A v = \lambda v$$

That equation is saying that multiplying the matrix $A$ by the vector $v$ can be expressed as just scaling $v$ element-wise by the scalar $\lambda$. The matrix $A$ could be huge, yet its action on this particular vector collapses to a single number.

We want to compute $\lambda$. In linear algebra, we cannot subtract a scalar from a matrix, so we cannot do $(A - \lambda)v = 0$ directly; we insert the identity matrix $I$ instead:

$$A v - \lambda I v = 0 \quad\Longrightarrow\quad (A - \lambda I) v = 0$$

People say that, in $(A - \lambda I)$, the matrix $A$ is shifted by $\lambda$.

In eigendecomposition, we exclude the case where $v = 0$, because the zero vector satisfies the equation for every $\lambda$ and tells us nothing about $A$.

In $(A - \lambda I) v = 0$ with $v \neq 0$, the shifted matrix sends a nonzero vector to zero, so it must be singular, and a matrix is singular exactly when its determinant is 0:

$$\det(A - \lambda I) = 0$$

If we expand this determinant, we get a polynomial in $\lambda$ called the characteristic equation, and its roots are the eigenvalues.
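As a minimal sketch of this in NumPy (the 2×2 matrix below is a hypothetical example chosen just for illustration), we can build the characteristic polynomial of a 2×2 matrix from its trace and determinant, solve for its roots, and compare against the library routine:

```python
import numpy as np

# Illustrative 2x2 symmetric matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# For a 2x2 matrix, det(A - lambda*I) expands to
# lambda^2 - trace(A)*lambda + det(A); its roots are the eigenvalues.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
print(np.roots(coeffs))          # [3. 1.]

# In practice we let the library solve det(A - lambda*I) = 0 for us.
print(np.linalg.eigvals(A))      # [3. 1.]
```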
Eigenvector
The eigenvector is more important than the eigenvalue, but you need to compute the eigenvalues before you can compute the eigenvectors:
- Find all eigenvalues $\lambda_1, \dots, \lambda_n$ from the characteristic equation.
- For each $\lambda_i$, find a vector $v_i$ which is in the null space of the matrix shifted by $\lambda_i$, $(A - \lambda_i I)$. That is, solve $(A - \lambda_i I) v_i = 0$.
Typically, after finding those vectors, people normalize them so that each vector has unit length.

To double-check the process of computing an eigenvector: first find the eigenvalues, then shift the original matrix by each eigenvalue, and find the basis vector (a normalized, unit-length vector) of the null space of the shifted matrix.
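Here is a sketch of that recipe, reusing the same illustrative matrix. One common numerical way to get a null-space basis is from the SVD, taking the right singular vectors whose singular values are (numerically) zero:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # same illustrative matrix as above

for lam in np.linalg.eigvals(A):
    shifted = A - lam * np.eye(2)        # the shifted matrix A - lambda*I
    # Rows of vt whose singular values are ~0 span the null space.
    _, s, vt = np.linalg.svd(shifted)
    v = vt[s < 1e-10][0]                 # already normalized to unit length
    print(lam, v, A @ v - lam * v)       # residual is ~0, so Av = lambda*v
```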
Diagonalization of a Matrix with Eigenvalue and Eigenvector
Diagonalization, in the context of eigendecomposition, means extracting a special property of a given matrix in the form of a diagonal matrix, using the eigenvalues and eigenvectors.
After decomposition, we have a set of eigenvalues and the corresponding eigenvectors. For example, for a 3×3 matrix $A$ we have

$$A v_1 = \lambda_1 v_1, \qquad A v_2 = \lambda_2 v_2, \qquad A v_3 = \lambda_3 v_3$$

These can be written in matrix form as below. Let $P = [\,v_1 \; v_2 \; v_3\,]$ be the matrix whose columns are the eigenvectors, and let $D$ be the diagonal matrix with $\lambda_1, \lambda_2, \lambda_3$ on its diagonal.

Multiplying the matrix of eigenvectors ($P$) by the diagonal matrix from the left, as in $DP$, looks tempting, but that means the first eigenvalue is multiplied with the first element of each eigenvector. It's wrong, because an eigenvalue and its eigenvector are a pair; you cannot mix an eigenvalue with a different eigenvector. Multiplying by $D$ from the right instead scales each whole column $v_i$ by its own $\lambda_i$.

So the set of equations $A v_i = \lambda_i v_i$ is written compactly as

$$A P = P D$$

By right-multiplying both sides by $P^{-1}$,

$$A = P D P^{-1}$$

That can be interpreted as a hidden insight of a given square matrix $A$: its essential behavior is the diagonal matrix $D$, expressed in the coordinate system formed by the eigenvectors in $P$.
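A quick NumPy check of both identities, again with the illustrative matrix from above:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, P = np.linalg.eig(A)   # columns of P are the eigenvectors
D = np.diag(eigvals)

# AP = PD: each column v_i is scaled by its own lambda_i.
print(np.allclose(A @ P, P @ D))                    # True
# A = P D P^{-1}
print(np.allclose(A, P @ D @ np.linalg.inv(P)))     # True
```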
Also, after diagonalization, matrix powers become easy. From the equation below,

$$A = P D P^{-1}$$

Suppose getting $A^2$:

$$A^2 = P D P^{-1} P D P^{-1}$$

By regrouping the matrices,

$$A^2 = P D (P^{-1} P) D P^{-1}$$

Because $P^{-1} P = I$,

$$A^2 = P D D P^{-1} = P D^2 P^{-1}$$

Because $D$ is diagonal, $D^2$ just squares each diagonal element, which is cheap. In general, $A^k = P D^k P^{-1}$.

We just did the eigendecomposition first and then took the power, but it still works to take the power first and then do the eigendecomposition, because if we have

$$A v = \lambda v$$

then multiplying both sides by $A$ gives

$$A^2 v = A (\lambda v)$$

We cannot change the order of matrix multiplication, but $\lambda$ is a scalar, so it can move freely:

$$A (\lambda v) = \lambda (A v)$$

Because $A v = \lambda v$,

$$\lambda (A v) = \lambda (\lambda v) = \lambda^2 v$$

So

$$A^2 v = \lambda^2 v$$

That is, $A^2$ has the same eigenvectors as $A$, with the eigenvalues squared.
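A sketch of both directions in NumPy, with an assumed exponent k = 5:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # illustrative matrix from above
eigvals, P = np.linalg.eig(A)

k = 5
# D^k only needs each diagonal element raised to the k-th power.
Ak = P @ np.diag(eigvals ** k) @ np.linalg.inv(P)
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True

# Power first, decompose second: same eigenvectors,
# eigenvalues raised to the k-th power.
print(np.sort(np.linalg.eigvals(Ak)))   # [1. 243.] = [1**5, 3**5]
```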
Eigenvalue of Diagonal Matrix
Eigenvalues of a diagonal matrix are the diagonal elements of that matrix. For example,

$$\det\left(\begin{bmatrix} 2 & 0 \\ 0 & 5 \end{bmatrix} - \lambda I\right) = (2 - \lambda)(5 - \lambda) = 0 \quad\Longrightarrow\quad \lambda_1 = 2,\; \lambda_2 = 5$$

So we can imagine that, even as a diagonal matrix gets bigger, the off-diagonal elements disappear through multiplication by 0 in the characteristic equation, and it gives us a bunch of factors $(d_i - \lambda)$, one for each diagonal element $d_i$.
Eigenvalue of Triangular Matrix
Eigenvalues of a triangular matrix come out the same way as for a diagonal matrix: regardless of whether it is upper or lower triangular, the eigenvalues are the diagonal elements. For example, for a 2×2 upper triangular matrix,

$$\det\left(\begin{bmatrix} 2 & 3 \\ 0 & 5 \end{bmatrix} - \lambda I\right) = (2 - \lambda)(5 - \lambda) - 3 \cdot 0 = 0 \quad\Longrightarrow\quad \lambda_1 = 2,\; \lambda_2 = 5$$

Because when computing the determinant, every term that includes an off-diagonal element is multiplied by a 0 from the other side of the diagonal, the off-diagonal elements disappear.
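A one-line check in NumPy with the triangular matrix from the example:

```python
import numpy as np

# The eigenvalues of a triangular matrix are its diagonal elements.
T = np.array([[2.0, 3.0],
              [0.0, 5.0]])
print(np.linalg.eigvals(T))   # [2. 5.]
```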
Repeated Eigenvalue
Distinct eigenvalues are the typical case, but the characteristic equation can also have a repeated root. For example,

$$A = \begin{bmatrix} 3 & 1 \\ 0 & 3 \end{bmatrix}$$

By the characteristic equation to find the eigenvalues,

$$\det(A - \lambda I) = (3 - \lambda)^2 = 0$$

When a root repeats like this, the eigenvalue $\lambda = 3$ counts twice (it has algebraic multiplicity 2), but it does not necessarily come with two independent eigenvectors.

For $\lambda = 3$, the shifted matrix is $A - 3I = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$, whose null space is spanned by the single vector $v = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, so this matrix does not have enough independent eigenvectors to be diagonalized.
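A sketch in NumPy of the example above, showing that the repeated eigenvalue does not yield two independent eigenvectors:

```python
import numpy as np

# Example with a repeated eigenvalue (the matrix from above).
A = np.array([[3.0, 1.0],
              [0.0, 3.0]])

eigvals, P = np.linalg.eig(A)
print(eigvals)                 # [3. 3.] -- the root 3 is repeated

# The two returned eigenvectors are (nearly) parallel.
v1, v2 = P[:, 0], P[:, 1]
print(abs(v1 @ v2))            # ~1.0, so P is singular: no diagonalization
```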
Eigendecomposition of Symmetric Matrix
Eigendecomposition of a symmetric matrix has the following two properties, assuming it has distinct eigenvalues:
- Eigenvalues are real values (no complex numbers).
- Eigenvectors are all pair-wise orthogonal, meaning $v_i^T v_j = 0$ whenever $i \neq j$.
If two vectors are orthogonal, their dot product is 0, so we start by forming a dot product. Below, $\lambda_1 \neq \lambda_2$ are eigenvalues of a symmetric matrix $A$, with eigenvectors $v_1$ and $v_2$:

$$\lambda_1 v_1^T v_2 = (\lambda_1 v_1)^T v_2$$

Because $A v_1 = \lambda_1 v_1$,

$$(\lambda_1 v_1)^T v_2 = (A v_1)^T v_2$$

By the LIVE EVIL rule (transposing a product reverses the order),

$$(A v_1)^T v_2 = v_1^T A^T v_2$$

Because the given matrix is symmetric ($A^T = A$),

$$v_1^T A^T v_2 = v_1^T A v_2$$

Because $A v_2 = \lambda_2 v_2$,

$$v_1^T A v_2 = v_1^T \lambda_2 v_2 = \lambda_2 v_1^T v_2$$

Extracting the first and the last expressions,

$$\lambda_1 v_1^T v_2 = \lambda_2 v_1^T v_2 \quad\Longrightarrow\quad (\lambda_1 - \lambda_2)\, v_1^T v_2 = 0$$

Because the eigenvalues are distinct, $\lambda_1 - \lambda_2 \neq 0$.

It means that $v_1^T v_2 = 0$: the eigenvectors are orthogonal.

Because the eigenvectors are orthogonal and normalized to unit length, the matrix $Q$ whose columns are the eigenvectors satisfies $Q^T Q = I$, i.e., $Q^{-1} = Q^T$, and the decomposition becomes $A = Q D Q^T$.

It means that we don't need to compute an inverse (which is difficult and time-consuming); we can replace it with just a transpose.
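A minimal NumPy sketch of these facts; np.linalg.eigh is the routine specialized for symmetric matrices:

```python
import numpy as np

# Illustrative symmetric matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, Q = np.linalg.eigh(A)   # eigh is for symmetric/Hermitian matrices

print(np.allclose(Q.T @ Q, np.eye(2)))              # True: Q^T Q = I
print(np.allclose(np.linalg.inv(Q), Q.T))           # True: inverse is transpose
print(np.allclose(A, Q @ np.diag(eigvals) @ Q.T))   # True: A = Q D Q^T
```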
Eigendecomposition of Singular Matrix
Eigendecomposition can also be applied to a singular matrix (a non-invertible matrix, whose determinant is 0). In that case, there will be at least one eigenvalue equal to 0. To know the rank of a singular matrix, we can count the number of non-zero eigenvalues, and the determinant is equal to the product of all eigenvalues.
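A short sketch with an assumed singular matrix (its second row is twice its first):

```python
import numpy as np

# Singular example: rows are linearly dependent, so det(A) = 0.
A = np.array([[1.0, 2.0],
              [2.0, 4.0]])

eigvals = np.linalg.eigvals(A)
print(eigvals)                              # [0. 5.] -- one eigenvalue is 0
print(np.sum(np.abs(eigvals) > 1e-10))      # 1 = rank (non-zero eigenvalues)
print(np.prod(eigvals), np.linalg.det(A))   # both ~0: det = product of eigvals
```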
Diagonalization as Sum of Matrices
From a given square matrix $A = P D P^{-1}$, diagonalization can also be written as a sum of simple matrices, one term per eigenvalue:

$$A = \sum_{i=1}^{m} \lambda_i \, v_i w_i^T$$

where $v_i$ is the $i$-th column of $P$ and $w_i^T$ is the $i$-th row of $P^{-1}$ (for a symmetric matrix, $w_i = v_i$).

Because $v_i$ is an $m \times 1$ column vector and $w_i^T$ is a $1 \times m$ row vector, each term $v_i w_i^T$ is an $m \times m$ matrix of rank 1.

We can arbitrarily choose any non-zero length for an eigenvector, since scaling it does not change the direction it points in. NumPy (and most other software) by default normalizes eigenvectors to unit length, which means $v_i^T v_i = 1$.

Eigenvectors point in the important directions in the space of the matrix $A$, and the eigenvalues tell us how strongly $A$ acts along each of those directions.
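A minimal sketch of this sum-of-rank-1 view in NumPy, reusing the illustrative symmetric matrix:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, P = np.linalg.eig(A)
P_inv = np.linalg.inv(P)

# Rebuild A one rank-1 layer at a time: lambda_i * v_i * w_i^T,
# where v_i is a column of P and w_i^T is a row of P^{-1}.
A_sum = sum(eigvals[i] * np.outer(P[:, i], P_inv[i, :])
            for i in range(len(eigvals)))
print(np.allclose(A, A_sum))   # True
```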