Singular Value Decomposition: SVD
Explore the concept of Singular Value Decomposition (SVD), its role in matrix factorization, and how to compute it using Python. Understand orthogonal diagonalization, the significance of singular values and vectors, and the economic SVD approach to handling zeros among the singular values. This lesson helps you grasp the mathematical foundations and practical implementations essential for data science applications.
Definition
The Singular Value Decomposition (SVD) of an $m \times n$ matrix, $A$, is the factorization of $A$ into the product of three matrices:

$A = U \Sigma V^T$

where the matrices $U$ and $V$ are orthogonal and the matrix, $\Sigma$, is a generalized diagonal matrix (its only non-zero entries lie on the main diagonal).
SVD in numpy
We can compute the SVD of a matrix, `A`, using `U,S,Vt = numpy.linalg.svd(A)`, where `S` is an array containing the diagonal entries of $\Sigma$ and `Vt` is $V^T$.
The `diag` function in the implementation below converts the `S` array into the generalized diagonal matrix, $\Sigma$. We've used the `D` variable to represent $\Sigma$ in the code.
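As a minimal sketch of this workflow (the example matrix is our own), the snippet below computes the SVD with numpy and rebuilds the generalized diagonal matrix from `S`, mirroring what the described `diag` helper does:

```python
import numpy as np

# A small rectangular matrix to decompose (our own example).
A = np.array([[3.0, 2.0, 2.0],
              [2.0, 3.0, -2.0]])

# U: left singular vectors, S: singular values (1-D array), Vt: V transposed.
U, S, Vt = np.linalg.svd(A)

# Build the generalized (rectangular) diagonal matrix Sigma from S,
# in the spirit of the diag helper described in the text.
D = np.zeros(A.shape)
D[:len(S), :len(S)] = np.diag(S)

# U @ D @ Vt reconstructs A up to floating-point error.
print(np.allclose(A, U @ D @ Vt))  # True
```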
Singular values and vectors
- The diagonal values of $\Sigma$ are the singular values of $A$.
- The columns of $U$ are the left singular vectors of $A$.
- The rows of $V^T$ are the right singular vectors of $A$.
We assume that the singular values in $\Sigma$ are in descending order. That's how `numpy.linalg.svd` returns the values in `S`. We can arrange the values in any order, provided we apply the corresponding rearrangement to the columns of $U$ and the rows of $V^T$.
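For instance (using a square example matrix of our own, so all factors are the same size), the factorization survives any permutation as long as it is applied consistently:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # square, so U, Sigma, Vt are all 3x3

U, S, Vt = np.linalg.svd(A)

# numpy returns the singular values in descending order.
print(np.all(S[:-1] >= S[1:]))    # True

# Any reordering works if U's columns and Vt's rows are permuted the same way.
perm = np.argsort(S)              # ascending order, just as an example
A_rebuilt = U[:, perm] @ np.diag(S[perm]) @ Vt[perm, :]
print(np.allclose(A, A_rebuilt))  # True
```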
Economic SVD
When $\Sigma$ has some zeros on the diagonal, we can drop them. We can also drop the corresponding columns in $U$ and the rows in $V^T$. If $\Sigma_r$ is the diagonal matrix with only the non-zero singular values, and $U_r$ and $V_r^T$ are the matrices of the corresponding columns and rows in $U$ and $V^T$, respectively, then

$A = U_r \Sigma_r V_r^T$

is the economic SVD of $A$, where the matrices, $U_r$ and $V_r^T$, may not be square and the matrix, $\Sigma_r$, is diagonal. The figure below illustrates the relationship between SVD and economic SVD.
Implementation of economic SVD
The `economicSVD` function below implements the economic SVD for a matrix, `A`. It's worth noticing that the number of non-zero singular values of `A` is equal to the rank of `A`.
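A minimal sketch of such an `economicSVD` function, built on `numpy.linalg.svd`; the tolerance `tol` for treating a singular value as zero is our own choice, not prescribed by the text:

```python
import numpy as np

def economicSVD(A, tol=1e-10):
    """Return (U_r, S_r, Vt_r), keeping only the non-zero singular values.

    A sketch of the economicSVD function described in the text; tol is an
    assumed threshold for deciding which singular values count as zero.
    """
    U, S, Vt = np.linalg.svd(A)
    r = np.sum(S > tol)  # number of non-zero singular values = rank(A)
    return U[:, :r], S[:r], Vt[:r, :]

# A rank-1 matrix: only one non-zero singular value survives.
A = np.outer([1.0, 2.0, 3.0], [4.0, 5.0])           # 3x2, rank 1
U_r, S_r, Vt_r = economicSVD(A)
print(S_r.shape)                                     # (1,)
print(np.allclose(A, U_r @ np.diag(S_r) @ Vt_r))     # True
```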
SVD: the heart of linear algebra!
Consider that the rank of an $m \times n$ matrix, $A$, is $r$, as shown in the figure above.
- The first $r$ columns of $U$ form an orthonormal basis of the column space of $A$, that is, $C(A)$.
- The first $r$ rows of $V^T$ form an orthonormal basis of the row space of $A$, that is, $C(A^T)$.
- The last $m - r$ columns of $U$ form an orthonormal basis of the null space of $A^T$, also known as the left null space of $A$.
- The last $n - r$ rows of $V^T$ form an orthonormal basis of the null space of $A$.
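The subspace claims above can be checked numerically; the small rank-2 matrix here is our own example (its third column is the sum of the first two):

```python
import numpy as np

# A 4x3 matrix of rank 2: column 3 = column 1 + column 2.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 0.0, 2.0]])

U, S, Vt = np.linalg.svd(A)
r = int(np.sum(S > 1e-10))
print(r)  # 2

# The last n - r rows of Vt span the null space of A: A v = 0 for each.
print(np.allclose(A @ Vt[r:, :].T, 0))   # True

# The last m - r columns of U span the left null space: A^T u = 0 for each.
print(np.allclose(A.T @ U[:, r:], 0))    # True
```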
Note: The SVD of a matrix tells us a lot about the matrix.
Derivation of SVD
To derive the matrices, $U$, $\Sigma$, and $V$, for an $m \times n$ matrix, $A$, we first note that the matrices, $AA^T$ and $A^TA$, are both symmetric. This is because $(AA^T)^T = (A^T)^T A^T = AA^T$ and $(A^TA)^T = A^T (A^T)^T = A^TA$. Furthermore, the eigenvalues of $AA^T$ and $A^TA$ are non-negative. When $\lambda$ is an eigenvalue of $A^TA$, corresponding to an eigenvector, $v$, we have:

$\lambda v^T v = v^T (\lambda v) = v^T A^T A v = (Av)^T (Av) = \|Av\|^2 \ge 0$

Because $v^T v = \|v\|^2 > 0$, it follows that $\lambda \ge 0$. The proof for $AA^T$ is similar.
Note: A symmetric matrix with non-negative eigenvalues is called a positive semi-definite matrix.
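We can confirm this numerically; the random rectangular matrix here is just an example:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))

# A A^T and A^T A are symmetric, so eigvalsh applies; all of their
# eigenvalues should be non-negative (up to floating-point round-off).
for M in (A @ A.T, A.T @ A):
    eigvals = np.linalg.eigvalsh(M)
    print(np.all(eigvals > -1e-10))  # True
```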
We need to find matrices $U$, $\Sigma$, and $V$, such that:
- $A = U \Sigma V^T$.
- $U^T U = U U^T = I_m$.
- $\Sigma$ is a generalized diagonal matrix, and hence, $\Sigma \Sigma^T$ and $\Sigma^T \Sigma$ are diagonal matrices with the same non-zero diagonal entries.
- $V^T V = V V^T = I_n$.
If $A = U \Sigma V^T$, then:

$AA^T = U \Sigma V^T (U \Sigma V^T)^T = U \Sigma V^T V \Sigma^T U^T = U (\Sigma \Sigma^T) U^T = U D_U U^T$

The equation above is an orthogonal diagonalization of $AA^T$. This means that $U$ is a matrix of orthonormal eigenvectors of $AA^T$ and $D_U = \Sigma \Sigma^T$ is the diagonal matrix of corresponding eigenvalues. Similarly, we can derive that $A^TA = V (\Sigma^T \Sigma) V^T = V D_V V^T$, where $V$ is a matrix of orthonormal eigenvectors of $A^TA$ and $D_V$ is the diagonal matrix of corresponding eigenvalues. Because the non-zero diagonal values of $D_U$ and $D_V$ are the same, it's convenient to omit the subscript and call it $D$. Finally, $\Sigma$ is a generalized diagonal matrix containing the square roots of the diagonal entries of $D$. Because both $AA^T$ and $A^TA$ are positive semi-definite, all diagonal entries of $D$ are non-negative, thus giving real square roots.
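The derivation can be verified numerically; the square random matrix below is our own example, so that $U$, $\Sigma$, and $V^T$ are all the same size:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3))
U, S, Vt = np.linalg.svd(A)

# U diagonalizes A A^T and V diagonalizes A^T A, with the same
# eigenvalues D = S**2 on the diagonal in both cases.
D = np.diag(S**2)
print(np.allclose(A @ A.T, U @ D @ U.T))    # True
print(np.allclose(A.T @ A, Vt.T @ D @ Vt))  # True
```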
SVD algorithm
Given an $m \times n$ matrix, $A$:
- To compute the eigenvalues, $D$, and eigenvectors, $U$, of $AA^T$, we can use the code below:
D,U = np.linalg.eig(A.dot(A.T))
- We can then compute the eigenvectors, $V$, of $A^TA$ by using the following code:
D,V = np.linalg.eig(A.T.dot(A))
- Finally, we make a generalized diagonal matrix, $\Sigma$, with the square roots of $D$, by using this code:
Sigma = diag(D,A.shape[0],A.shape[1])**0.5
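The steps above can be assembled into a working sketch. Note one subtlety: computing $U$ and $V$ independently with `eig` leaves the signs of the eigenvectors unmatched, so $U \Sigma V^T$ may not equal $A$. The sketch below (our own variation, not numpy's internal method) instead derives each left singular vector as $u_i = Av_i / \sigma_i$, which fixes the signs; the `diag` helper is our reconstruction of the one described earlier, and the function assumes $A$ has full column rank for brevity:

```python
import numpy as np

def diag(d, m, n):
    """Place the entries of d on the main diagonal of an m x n zero matrix."""
    M = np.zeros((m, n))
    k = min(len(d), m, n)
    M[:k, :k] = np.diag(d[:k])
    return M

def svd_via_eig(A):
    """SVD through the eigendecomposition of A^T A (a sketch).

    Assumes A has full column rank. U is derived column by column as
    u_i = A v_i / sigma_i, so its signs are consistent with V.
    """
    # eigh suits the symmetric matrix A^T A; eigenvalues come back ascending.
    eigvals, V = np.linalg.eigh(A.T @ A)
    order = np.argsort(eigvals)[::-1]           # descending singular values
    eigvals, V = eigvals[order], V[:, order]
    S = np.sqrt(np.clip(eigvals, 0.0, None))    # clip tiny negatives from round-off
    U = (A @ V) / S                             # column i is A v_i / sigma_i
    return U, S, V.T

A = np.array([[3.0, 2.0],
              [2.0, 3.0],
              [2.0, -2.0]])
U, S, Vt = np.linalg.svd(A, full_matrices=False)  # reference result
U2, S2, Vt2 = svd_via_eig(A)
print(np.allclose(S, S2))                          # True
print(np.allclose(A, U2 @ np.diag(S2) @ Vt2))      # True
```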