The Mysterious Rank (of a Matrix)

1 comment
In this post I will be talking about the simple and beautiful concept of matrices. In particular I will discuss about the rank of a matrix and its usage in solving system of linear equations. In order to keep the post to a reasonable length I will not dwell upon the usual operations on matrices and assume that the reader is familiar with them, but a paragraph or two about these operations definitely makes sense.

Introduction

A matrix is a system of some numbers (elements of a field) arranged in a rectangular form so that we can pin-point an element in a given row and column in the rectangular arrangement. If $ A$ is a matrix which has $ m$ rows and $ n$ columns we say that $ A$ is a matrix of order $ m\times n$. Also the element which is in row $ i$ and column $ j$ is denoted by $ A(i, j)$ or for simplicity by $ a_{ij}$ and in that case we write $ A = [a_{ij}]_{m\times n}$. Two matrices of same order can be added or subtracted by just adding or subtracting the corresponding elements.

It turns out that doing the same thing for multiplication does not make good mathematical sense in terms of usefulness. Hence matrices are multiplied in a seemingly unusual way which is associative but not commutative and moreover restrictive in the sense that not any given pair of matrices can be multiplied. For multiplication of matrices $ A$ and $ B$ in that order to be defined the number of columns in $ A$ must match the number of rows in $ B$. Hence it may be the case sometimes that $ AB$ is defined but not $ BA$. The complicated and unusual definition of matrix multiplication given below
$$ AB = C \text{ where } A = [a_{ij}]_{m \times n},\, B = [b_{ij}]_{n \times p},\, C = [c_{ij}]_{m \times p}$$ and $$c_{ij} = \sum_{k = 1}^{n} a_{ik}b_{kj}$$ turns out to be very useful and very natural once we learn their usage properly.

Actual matrices with numbers are in written in this way: $$ A = \begin{bmatrix} 1 & 2 & 3 \\7 & 8 & 9 \\4 & 5 & 6\end{bmatrix}$$ or sometimes in this way: $$ A = \begin{pmatrix} 1 & 2 & 3 \\7 & 8 & 9 \\4 & 5 & 6\end{pmatrix}$$ The first and simplest application of the matrix concept is the solution of linear equations to which we turn next.

System of Linear Equations

We are all familiar with problems of type "Find two numbers whose sum is 8 and difference is 2". These kinds of problems always boil down to having multiple linear equations in multiple variables like $ x + y = 8,\, x - y = 2$. The above equations belong to a general family called "simultaneous linear equations" or "system of linear equations". The word linear signifies that the variables in equations are of degree one. If we look closely at the sample system of equations given above we can see that the same can be written in matrix language as: $$ \begin{bmatrix} 1 & 1 \\1 & -1\end{bmatrix}\begin{bmatrix}x \\y\end{bmatrix} = \begin{bmatrix} 8 \\2\end{bmatrix}$$ or $$ AX = B$$ where $$ A = \begin{bmatrix} 1 & 1 \\1 & -1\end{bmatrix},\, X =\begin{bmatrix}x \\y\end{bmatrix},\, B = \begin{bmatrix}8 \\2\end{bmatrix}$$ Thanks to unusual rule of matrix multiplication we can write the above system of linear equations into a single matrix equation. In general a system of linear equations \begin{align} a_{11}x_{1} + a_{12}x_{2} + \cdots + a_{1n}x_{n} &= b_{1}\notag\\ a_{21}x_{1} + a_{22}x_{2} + \cdots + a_{2n}x_{n} &= b_{2}\notag\\ a_{31}x_{1} + a_{32}x_{2} + \cdots + a_{3n}x_{n} &= b_{3}\notag\\ \cdots \cdots \cdots &= \cdots\notag\\ a_{m1}x_{1} + a_{m2}x_{2} + \cdots + a_{mn}x_{n} &= b_{m}\notag \end{align} can be written in the form of a matrix equation as $$ AX = B$$ where $$ A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\a_{21} & a_{22} & \cdots & a_{2n} \\\cdots & \cdots & \cdots & \cdots \\a_{m1} & a_{m2} & \cdots & a_{mn}\end{bmatrix},\, X = \begin{bmatrix} x_{1} \\x_{2} \\\cdots \\x_{n}\end{bmatrix},\, B = \begin{bmatrix} b_{1} \\b_{2} \\\cdots \\b_{m}\end{bmatrix}$$ The matrix $ A$ is called the coefficient matrix of the system. A related matrix which holds all the numbers present in the system of equations namely $$ \tilde{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_{1} \\a_{21} & a_{22} & \cdots & a_{2n} & b_{2} \\\cdots & \cdots & \cdots & \cdots & \cdots \\a_{m1} & a_{m2} & \cdots & a_{mn} & b_{m}\end{bmatrix}$$ is called the augmented matrix of the system.

The solution of the system of equations then consists of the finding the matrix $ X$ (whose elements turn out to be the solutions of the given system). A question to ask first is: does the system of equations have a solution? The answer as we will prove later turns out to be one of the following three mutually exclusive and exhaustive options:
  1. System has no solution - we say that the system is inconsistent. Otherwise the system is said to be consistent.
  2. System has a unique solution.
  3. System has infinitely many solutions.
To find out which one of the three above options hold for a given system of equations requires us to use the (mysterious) concept of a rank of matrix. The word mysterious is used here because the initial presentation of this concept in high school textbooks is completely mysterious and its link with the solution of equations is even more so. We will define this concept later but let's assume for the present that the rank of matrix is a unique non-negative integer. Also rank of two different matrices may be same.

We now come to the fundamental theorem in the solution of equations which is stated as follows:

Solution of System of Linear Equations: Let the system of equations in $ n$ variables be represented in the matrix form as $ AX = B$ where $ A$ is the coefficient matrix and let $ \tilde{A}$ be the augmented matrix. Also let $ r$ be the rank of coefficient matrix $ A$ and $ r^{\prime}$ be the rank of the augmented matrix $ \tilde{A}$. Then we have $ r \leq r^{\prime} \leq n$. The system has:
  1. no solutions if $ r < r^{\prime}$
  2. a unique solution if $ r = r^{\prime} = n$
  3. infinitely many solutions if $ r = r^{\prime} < n$.
We next turn to the definition of the rank of a matrix and will start with the usual and highly contrived definition presented in high school textbooks. But to present that definition we require the concept of determinant of a square matrix (i.e. matrix with equal number of rows and columns). So let's again start with the common textbook definition of determinant which is recursive and highly contrived and mysterious as it does not lead to any of its properties (a proper definition of a determinant which avoids these issues will be presented in a later post).

Determinant of a Square Matrix

It turns out that with any given square matrix we can associate a unique number which finds many applications in operations related with matrices. This number we call the determinant of the matrix. Also note that this concept is defined only for square matrices. To start the determinant of a $ 1\times 1$ matrix $ A = [a]$ is simply defined to be the element $ a$. The determinant of a $ 2 \times 2$ matrix is defined as follows (also check the notation using vertical bars):
$$ A = \begin{bmatrix} a & b \\c & d\end{bmatrix},\,\, \text{det}(A) = \begin{vmatrix} a & b \\c & d\end{vmatrix} = ad - bc$$ To define the determinant of an $ n \times n$ matrix we proceed recursively and define it on the basis of determinants of $ (n - 1) \times (n - 1)$ matrices. To that end let $ A = [a_{ij}]_{n \times n}$ be a given square matrix of order $ n \times n$. For any element $ a_{ij}$ let $ A_{ij}$ be the matrix of order $ (n - 1) \times (n - 1)$ obtained from $ A$ by removing row $ i$ and column $ j$.

We define the cofactor $ C_{ij}$ of element $ a_{ij}$ by: $$ C_{ij} = (-1)^{i + j}\,\text{det}(A_{ij})$$ Then the determinant of $ A$ is defined by: $$\text{det}(A) = \sum_{j = 1}^{n} a_{1j}C_{1j}$$ For a $ 3\times 3$ matrix this means that $$ A = \begin{bmatrix} a & b & c \\d & e & f \\g & h & i\end{bmatrix},\,\text{det}(A) = \begin{vmatrix} a & b & c \\d & e & f \\g & h & i\end{vmatrix} = a\begin{vmatrix} e & f \\h & i\end{vmatrix} - b\begin{vmatrix} d & f \\g & i\end{vmatrix} + c\begin{vmatrix} d & e \\g & h\end{vmatrix}$$ This is called the expansion of determinant along the first row. The amazing property of determinants is that they can be expanded along any row (or any column) and the same value is obtained finally. This is one of the properties which is very difficult to establish using the given recursive definition of the determinant.

After a brief introduction to determinants we are now ready to define the rank of any arbitrary matrix.

The Mysterious Rank of a Matrix

Let $ A$ be a given matrix. A submatrix of $ A$ is any matrix obtained by removing some rows and columns from $ A$. The rank of matrix $ A$ is a non-negative integer $ r$ such that there is a square submatrix $ B$ of $ A$ with order $ r \times r$ and $ \text{det}(B) \neq 0$ and any square submatrix of $ A$ of order higher than $ r \times r$ has determinant zero.

Thus we can find $ r$ rows and $ r$ columns in $ A$ such that the determinant of the matrix formed by these rows and columns is non-zero, but if we take any set of more than $ r$ rows and columns the determinant of the resulting matrix is zero. Thus the rank is the order of the highest square submatrix with a non-zero determinant.

From the definition it is absolutely clear that the rank of a matrix $ A$ of order $ m \times n$ is less than or equal to the minimum of $ m$ and $ n$. Also it is clear that the rank of a matrix is equal to the rank of its transpose (we can't prove this here because it is dependent upon the (unproved so far in our posts) fact that the determinant of a square matrix is equal to the determinant of its transpose).

As an example we can see that the rank of the following matrix $ A$ is $ 2$ $$ A = \begin{bmatrix} 3 & 4 & 5 \\1 & 2 & 3 \\2 & 4 & 6\end{bmatrix}$$ because $ \text{det}(A) = 0$ and $$ \begin{vmatrix} 3 & 4 \\1 & 2\end{vmatrix} = 2 \neq 0$$ How does the concept of rank relate to the solution of "system of linear equations"? At this point this question seems baffling as we don't see any connection between this concept of rank based on determinants and the solutions of system of linear equations which is based on eliminating unknowns by adding / subtracting the given equations.

We will next focus on the usual method of solving linear equations as italicized in the last paragraph and will see that a systematic use of this method reveals the relationship of rank with the solution of system of linear equations. At the same time we will also see that this approach will lead to a more fruitful definition of the concept of rank of a matrix and thereby will demystify the mysterious rank. However to keep the current post from becoming unmanageable this will be slated for the next post.

Print/PDF Version

1 comment :: The Mysterious Rank (of a Matrix)

Post a Comment

  1. Lisbon from MSE : this is magnificent, and I hope I'll be able to direct more people here.