In the previous post we formed the notions of a matrix, determinants, the rank of a matrix and systems of linear equations. We also mentioned the fundamental theorem on the solutions of systems of linear equations, which tells us the importance of rank in deciding the nature of the solutions of such systems. It is now time to establish that theorem, and to do so we start by defining a systematic method of solving such systems of linear equations.
Elementary Row Operations on a Matrix
Let's start with the simple system of equations: \begin{align} x + y &= 8\notag\\ x - y &= 2\notag \end{align} If we add these equations we get $ 2x = 10$, so that $ y$ is eliminated and we get $ x = 5$. Putting this back in the first equation we see that $ y = 8 - x = 3$. The basic idea is to eliminate one of the variables (thereby reducing the complexity of the system) by adding or subtracting the given equations.

Sometimes we might need to multiply an equation by a suitable number before adding or subtracting. For example consider: \begin{align} x + 2y &= 7\notag\\ 2x - y &= 4\notag \end{align} Here we need to multiply the first equation by $ 2$ and then subtract the resulting equation from the second equation to get $ -5y = -10$, so that $ y = 2$ and then $ x = 3$. From these simple examples it is clear that the solution of such systems of equations consists of two parts (a slightly bigger worked example follows the list below):
- Elimination of unknowns one by one until finally only one variable is left
- Back substitution to get the values of the variables one after another
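For a slightly bigger illustration of both parts, consider the following three-variable system (an extra example, not one of the systems above): \begin{align} x + y + z &= 6\notag\\ 2x - y + z &= 3\notag\\ x + 2y - z &= 2\notag \end{align} Subtracting the first equation from the second gives $ x - 2y = -3$, and adding the first and third equations gives $ 2x + 3y = 8$, so $ z$ is eliminated. Subtracting twice the first of these new equations from the second gives $ 7y = 14$, eliminating $ x$. Back substitution now yields the values one after another: $ y = 2$, then $ x = 2y - 3 = 1$, and finally $ z = 6 - x - y = 3$.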
The operations used above amount to forming a linear combination of the given equations: we multiply the equations by suitable constants and add the results to obtain a new equation. Clearly any solution of the original system also satisfies this new equation, but there may be other values of the variables which satisfy it. We can form many more linear combinations in this fashion and thereby obtain a new system of linear equations. Any solution of the original system is then also a solution of the newer system, but the newer system might have some solutions which are not solutions of the original system. However, if we are also able to express the equations of the original system as linear combinations of the equations of the newer system, then it follows that both systems of linear equations have the same solutions.
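As a quick illustration with the first system above: twice the first equation plus three times the second gives $ 2(x + y) + 3(x - y) = 2\cdot 8 + 3\cdot 2$, i.e. $ 5x - y = 22$. The solution $ x = 5, y = 3$ of the original system satisfies this new equation, but so do many other pairs, for example $ x = 4, y = -2$.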
If two systems of linear equations have the same set of solutions we say that they are equivalent. What we have shown above is that:
If two systems of linear equations are such that any equation of one system is a linear combination of the equations of the other system and vice-versa, then both the systems are equivalent.
The idea behind solving these systems of equations is then to find simpler but equivalent systems by making linear combinations of the existing equations. Moreover, the goal is to eliminate variables one by one, which means that the linear combinations should be so chosen that the coefficients of some variables in the resulting equations turn out to be zero. We further note that while making linear combinations of equations we are just manipulating the coefficients of the variables, and hence it might be better to drop the variables altogether and work with the coefficient/augmented matrix of the system. Forming linear combinations of the equations then corresponds to operations on the rows of the augmented matrix.
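For instance, the second system above ($ x + 2y = 7$, $ 2x - y = 4$) has the augmented matrix $$ \tilde{A} = \begin{bmatrix} 1 & 2 & 7 \\ 2 & -1 & 4 \end{bmatrix}$$ and subtracting twice the first row from the second produces the row $ \begin{bmatrix} 0 & -5 & -10 \end{bmatrix}$, i.e. the equation $ -5y = -10$: exactly the elimination performed earlier.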
It turns out that we need only those kinds of operations on the rows of the matrix which lead to equivalent systems. In particular, each operation should be reversible: we should be able to get back the original matrix by applying a suitable operation of the same kind. With this requirement we define the elementary row operations on a matrix.
Elementary Row Operations
Let $ A$ be a matrix and let $ R_{i}$ denote its $ i^{\text{th}}$ row. Then we define three types of elementary row operations on $ A$:
- Multiplying a row by some non-zero number $ c$: $ cR_{i} \rightarrow R_{i}$
- Exchange of two rows: $ R_{i} \leftrightarrow R_{j}$
- Adding a multiple of one row to another row: $ R_{i} + cR_{j} \rightarrow R_{i}$
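Note that each of these operations can be undone by an operation of the same type, which is precisely the requirement stated above: $ cR_{i} \rightarrow R_{i}$ is undone by $ (1/c)R_{i} \rightarrow R_{i}$ (this is where the restriction $ c \neq 0$ is essential), $ R_{i} \leftrightarrow R_{j}$ is undone by applying it once more, and $ R_{i} + cR_{j} \rightarrow R_{i}$ is undone by $ R_{i} - cR_{j} \rightarrow R_{i}$. Hence applying elementary row operations to the augmented matrix always leads to an equivalent system.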
We can now apply these row operations on the augmented matrix $ \tilde{A}$ of the system of equations and try to simplify it into a form which mirrors the process of elimination of variables. Thus we would want to reduce the augmented matrix into a form which contains many zero entries. A systematic description of such a form is given below:
Row Reduced Echelon Form: A matrix is said to be in row-reduced echelon form if it satisfies the following properties:
- any row which consists of all zero entries cannot lie above a row which has some non-zero entry; thus all the zero rows must lie below the non-zero rows
- in a non-zero row the first non-zero entry must be $ 1$
- if the first non-zero entry of a row lies in a column $ j$, then all the other entries of this column must be $ 0$
- if the first non-zero entries of all the non-zero rows lie in columns $ k_{1}, k_{2}, \ldots, k_{r}$, then $ k_{1} < k_{2} < \cdots < k_{r}$.
For example, the matrix $$ A = \begin{bmatrix} 1 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$ is in row-reduced echelon form because:
- the zero row lies below the non-zero rows
- the first non-zero entry in each of the first, second and third rows is $ 1$
- every column which contains such a first non-zero entry has all its other entries zero (check columns 1, 3 and 4)
- the columns of these first non-zero entries satisfy $ 1 < 3 < 4$ (i.e. the first non-zero entry of a row lies to the right of the first non-zero entry of the previous row)
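To connect this form with the elimination process, here is a worked reduction of the augmented matrix of the second system from the beginning of the post: $$ \begin{bmatrix} 1 & 2 & 7 \\ 2 & -1 & 4 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 2 & 7 \\ 0 & -5 & -10 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 2 & 7 \\ 0 & 1 & 2 \end{bmatrix} \rightarrow \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 2 \end{bmatrix}$$ where the operations used are $ R_{2} - 2R_{1} \rightarrow R_{2}$, then $ -\frac{1}{5}R_{2} \rightarrow R_{2}$, and finally $ R_{1} - 2R_{2} \rightarrow R_{1}$. The last matrix is in row-reduced echelon form and directly encodes the solution $ x = 3, y = 2$ obtained earlier.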
We now have the machinery to formulate a new definition of the rank of a matrix and thereby put this concept in its proper context. To that end we need to observe that if $ A$ is a square matrix and $ e$ is an elementary row operation, then $ \text{det}(A) = 0 \Rightarrow \text{det}(e(A)) = 0$ and $ \text{det}(A) \neq 0 \Rightarrow \text{det}(e(A)) \neq 0$. Thus the rank (as defined using determinants of square submatrices) of any matrix does not change when we apply an elementary row operation to it. Hence the rank of a matrix is the same as the rank of the corresponding matrix in row-reduced echelon form. Clearly the rank of a matrix in row-reduced echelon form is the number of non-zero rows in it (because we can choose these rows and the columns which contain the first non-zero entries of these rows to get a non-zero determinant, while choosing any more rows will always include a full row of zeroes and thereby lead to a zero determinant). We thus have the following result:
The rank of a matrix is the number of non-zero rows in the corresponding row-reduced echelon form.
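To make the observation about determinants concrete, take $ B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ with $ \text{det}(B) = -2$: scaling a row by $ c \neq 0$ multiplies the determinant by $ c$, exchanging the two rows changes its sign to give $ 2$, and $ R_{2} - 3R_{1} \rightarrow R_{2}$ leaves it unchanged at $ -2$. In every case a zero determinant stays zero and a non-zero determinant stays non-zero. As a quick check of the result itself, the matrix $ A$ displayed earlier has three non-zero rows and hence rank $ 3$: its first three rows together with columns $ 1, 3, 4$ form the identity matrix $ I_{3}$ with determinant $ 1$, while any $ 4 \times 4$ submatrix must include the zero row and therefore has determinant $ 0$.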
Since the rank is a uniquely defined number, it follows that the number of non-zero rows in the row-reduced echelon form is also unique: it depends only on the original matrix and not on the sequence of elementary row operations used for reducing it to echelon form. In fact it is also true that the row-reduced echelon form itself is unique for a given matrix and does not depend upon the sequence of elementary row operations used (but we don't prove this fact here). The above result can also be taken as an alternative and proper definition of the rank of a matrix (in fact this is a common approach in undergraduate texts).
We are now ready to prove the fundamental theorem on the solutions of systems of linear equations and thereby show the importance of the concept of rank in these matters. However, this post has already grown a bit large, and we postpone these developments to the next and final post in this series.