Continuous Functions on a Closed Interval: Intermediate Value Theorem

After discussing the boundedness property of continuous functions, it is now time to discuss another fundamental property of continuous functions, the Intermediate Value Theorem. Roughly, it says that if a continuous function takes two values $ A$ and $ B$ then it takes all values between $ A$ and $ B$. Thus the values of the function also maintain a continuity in passing from one value to another. The theorem has practical applications, for example in solving equations. But more than that, it is the most widely advertised property of continuous functions and is mentioned in almost every calculus book (normally without proof). Some authors contend that this is the essence of continuity. However this is not the case, as there are discontinuous functions which possess this property. For example, the function equal to $ \sin(1/x)$ for $ x \neq 0$ and equal to $ 0$ at $ x = 0$ is discontinuous at $ 0$, yet it takes every intermediate value on every interval.

So let's formally state the property:

Intermediate Value Theorem: If a function $ f$ is continuous on a closed interval $ [a, b]$ with $ f(a) \neq f(b)$ and $ K$ is a number between $ f(a)$ and $ f(b)$, then there is a $ c \in (a, b)$ such that $ f(c) = K$.

First let's understand the property graphically. It says that if we draw the graph of the function $ f$ on the interval $ [a, b]$ and draw a horizontal line $ y = K$ whose height lies between the heights of the points $ (a, f(a))$ and $ (b, f(b))$, then this line must intersect the graph of the function at least once.

Figure: Intermediate Value Theorem

To put it more intuitively, if some portion of a continuous curve lies on one side of a straight line and some portion of it lies on the other side of the line, then the curve must intersect the line at least once. This is quite obvious! Why on earth do we need to prove it? This is often the reaction of students who encounter this theorem for the first time. They take it as too obvious (like an axiom) to demand a proof. But obvious as the theorem may seem, the proof is not obvious and depends ultimately on the theory of real numbers.

Let us first try to use the Heine Borel Principle. To simplify things let us define a new function $ g(x) = f(x) - K$. Since $ K$ lies between $ f(a)$ and $ f(b)$, it follows that $ g(a)$ and $ g(b)$ are of opposite signs. Also define $ g(x) = g(a)$ for $ x < a$ and $ g(x) = g(b)$ for $ x > b$, so that $ g$ is continuous everywhere. Assume on the contrary that $ f(x) \neq K$ for every $ x \in [a, b]$, so that $ g(x) \neq 0$ for all $ x \in [a, b]$. Since $ g$ is continuous, for each $ x \in [a, b]$ we can find a neighborhood $ I_{x}$ of $ x$ in which $ g$ maintains its sign. The collection of all these neighborhoods clearly forms a cover of $ [a, b]$ and hence by the Heine Borel Principle we can select a finite number of them, say $ I_{x_{1}}, I_{x_{2}}, \cdots, I_{x_{n}}$, which cover the whole interval $ [a, b]$. Naturally these neighborhoods must overlap (because each point of the interval $ [a, b]$ has to be an interior point of one of the chosen neighborhoods) in such a fashion that every two consecutive neighborhoods (in order of their position on the number line) have an overlapping part. Since $ g$ maintains its sign in each of these neighborhoods, it follows because of the overlapping parts that $ g$ maintains its sign throughout the interval $ [a, b]$ and therefore $ g(a)$ and $ g(b)$ are of the same sign. This contradiction proves that there is some $ c \in [a, b]$ for which $ g(c) = 0$. Since $ g(a)$ and $ g(b)$ are non-zero, this $ c$ is neither $ a$ nor $ b$, so that $ c \in (a, b)$. Thus we have a $ c \in (a, b)$ for which $ f(c) = K$.

It is especially important to prove this property using Dedekind's Theorem. Let us assume that $ f(a) < K < f(b)$ and divide the real numbers into two sets $ L, U$ in the following way. Put all numbers less than $ a$ in $ L$ and all numbers greater than $ b$ in $ U$. For numbers $ x \in [a, b]$ put $ x \in L$ if all the values of $ f$ in the interval $ [a, x]$ are less than $ K$ and put $ x \in U$ otherwise (i.e. when there is at least one value of $ f$ which is greater than or equal to $ K$ in the interval $ [a, x]$). Clearly both the sets $ L, U$ are non-empty and form a section of the real numbers and therefore by Dedekind's Theorem there is a unique real number $ c$ with the property that all numbers less than $ c$ belong to $ L$ and all the numbers greater than $ c$ belong to $ U$. Since $ L$ contains numbers greater than $ a$ and $ U$ contains numbers less than $ b$ (because of continuity of $ f$ at $ a, b$) it follows that $ c \in (a, b)$. We shall prove that $ f(c) = K$. Let's assume on the contrary that $ f(c) < K$. Then by continuity there is a neighborhood $ [c - h, c + h]$ where all the values of $ f$ are less than $ K$. Since $ c - h \in L$ it follows that values of $ f$ in $ [a, c - h]$ are less than $ K$. It now follows that the values of $ f$ are less than $ K$ in the entire interval $ [a, c + h]$ which leads to the conclusion that $ c + h \in L$. This is a contradiction and hence we must have $ f(c) \geq K$. Similarly we can prove that $ f(c) > K$ also leads to a contradiction. Therefore the only option we have is that $ f(c) = K$. It is also important to note that the number $ c$ we have found here is the unique number $ c$ with property $ f(c) = K$ and $ f(x) < K$ whenever $ x \in [a, c)$.

It is therefore established that there is a first value of $ x$ for which $ f(x) = K$, and in a similar manner we can prove that there is a last value of $ x$ for which $ f(x) = K$. Thus for any value $ K$ in the range of a continuous function $ f$ defined on a closed interval there is a first value, say $ x_{1}$, and a last value $ x_{2}$ of the independent variable such that $ f(x_{1}) = f(x_{2}) = K$. It may happen that there is only one value of $ x$ for which $ f(x) = K$, in which case $ x_{1} = x_{2}$. This is more than what is commonly stated in the Intermediate Value Theorem. In fact the proof given above using the Heine Borel Principle does not provide any indication that there is a first and a last value of $ x$ for which $ f(x) = K$.
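
As a small illustration (an example of our own choosing, not part of the argument above), take $ f(x) = \sin x$ on $ [0, 5\pi/2]$ and $ K = 1/2$. Here $ f(0) = 0 < 1/2 < 1 = f(5\pi/2)$, and the equation $ \sin x = 1/2$ has exactly three solutions in the interval, namely $ x = \pi/6$, $ x = 5\pi/6$ and $ x = 13\pi/6$. The first value is $ x_{1} = \pi/6$, and indeed $ \sin x < 1/2$ for all $ x \in [0, \pi/6)$, while the last value is $ x_{2} = 13\pi/6$.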

The same conclusion (along with the above mentioned fact about the first and last values of $ x$) can be reached by using the Supremum Principle. We only need to consider the set $ A = \{ x \mid x \in [a, b], f(y) < K\,\text{for all}\, y \in [a, x]\}$. The supremum $ c$ of the set $ A$ is the number with the desired property $ f(c) = K$. The proof is very similar to the one given using Dedekind's Theorem and hence we leave it as an exercise for the reader.

All the above proofs guarantee the existence of a number $ c \in (a, b)$ for which $ f(c) = K$, but they do not provide an effective method of calculating the value of $ c$ in practice. The following proof, based on the Nested Interval Principle, does in fact provide a method for calculating $ c$.

Again we switch back to the function $ g(x) = f(x) - K$ where $ g(a)$ and $ g(b)$ are of opposite signs. Let us divide the interval $ [a, b]$ into two equal sub-intervals $ [a, (a + b)/2]$ and $ [(a + b)/2, b]$. If $ g$ vanishes at the mid-point $ (a + b)/2$ then we are done. If not, then one of these two sub-intervals must be such that $ g(x)$ has opposite signs at its end points. Choose that sub-interval and call it $ [a_{1}, b_{1}]$. Repeating the process, we find that either $ g$ vanishes at the mid-point of one of the intervals $ [a_{n}, b_{n}]$, so that we are done, or we get a sequence of nested intervals $ [a_{n}, b_{n}]$ such that $ g(x)$ has opposite signs at the end points of each of these intervals. By the Nested Interval Principle there is a unique number $ c \in [a, b]$ which is contained in all the intervals $ [a_{n}, b_{n}]$. Note that the uniqueness of $ c$ follows from the fact that $ b_{n} - a_{n} = (b - a)/2^{n} < (b - a)/n$, so that the lengths of the intervals become arbitrarily small. We shall prove that $ g(c) = 0$. In case $ g(c) \neq 0$ we will have a neighborhood $ (c - h, c + h)$ where $ g(x)$ maintains the same sign. But we can find a value of $ n$ for which $ [a_{n}, b_{n}]$ is wholly contained in $ (c - h, c + h)$, and since $ g(x)$ has opposite signs at the end points of $ [a_{n}, b_{n}]$ we obtain a contradiction. This proves that $ g(c) = 0$ or equivalently $ f(c) = K$.
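
The halving procedure used in this proof is what is usually called the bisection method. The following is a minimal sketch of it in Python, assuming floating point arithmetic is accurate enough for the function at hand. The function name `bisect_root`, the tolerance `tol` and the sample function $ f(x) = x^{3} + x$ with $ K = 5$ are our own choices for illustration and do not come from the proof.

```python
def bisect_root(g, a, b, tol=1e-12, max_iter=200):
    """Approximate a zero of g in [a, b].

    Assumes g is continuous on [a, b] and that g(a), g(b) have
    opposite signs (or one of them is zero).
    """
    ga, gb = g(a), g(b)
    if ga == 0:
        return a
    if gb == 0:
        return b
    if (ga > 0) == (gb > 0):
        raise ValueError("g(a) and g(b) must have opposite signs")
    for _ in range(max_iter):
        mid = (a + b) / 2
        gm = g(mid)
        # Stop if g vanishes at the mid-point or the interval is tiny.
        if gm == 0 or (b - a) / 2 < tol:
            return mid
        # Keep the half whose end points still carry opposite signs.
        if (gm > 0) == (ga > 0):
            a, ga = mid, gm
        else:
            b, gb = mid, gm
    return (a + b) / 2


# Illustration: f(x) = x**3 + x on [0, 2] with K = 5, so g(x) = f(x) - K.
K = 5.0
c = bisect_root(lambda x: x**3 + x - K, 0.0, 2.0)
print(c, c**3 + c)
```

Each pass through the loop halves the interval, exactly as $ b_{n} - a_{n} = (b - a)/2^{n}$ in the proof, so about forty steps already shrink an interval of length $ 2$ below $ 10^{-11}$.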

Applications of the Intermediate Value Theorem

First of all it provides us with a proof of the existence of $ n^{\text{th}}$ roots of positive numbers. Let $ \alpha > 0$ be a given real number and $ n$ be a given positive integer. If $ n = 1$ there is nothing to prove (take $ \beta = \alpha$), so let $ n \geq 2$. Consider the function $ f(x) = x^{n} - \alpha$ which is continuous everywhere. Clearly $ f(0) < 0$. If $ \alpha < 1$ then $ f(1) > 0$. If $ \alpha > 1$ then $ f(\alpha) = \alpha(\alpha^{n - 1} - 1) > 0$. If $ \alpha = 1$ then $ f(2) > 0$. Thus in any case we have some number $ c > 0$ for which $ f(c) > 0$. By the Intermediate Value Theorem there is a number $ \beta$ in the interval $ (0, c)$ for which $ f(\beta) = 0$. So we have $ \beta^{n} = \alpha$. Clearly this $ \beta$ is unique because $ f(x)$ is a strictly increasing function of $ x$ for $ x > 0$. The unique number $ \beta$ with the above property is denoted by the symbol $ \sqrt[n]{\alpha}$ or $ \alpha^{1/n}$.

Hence given a positive number $ \alpha$ and a positive integer $ n$, there exists a unique positive number $ \beta$ such that $ \beta^{n} = \alpha$. This number $ \beta$ is called the positive $ n^{\text{th}}$ root of $ \alpha$ and denoted by $ \alpha^{1/n}$ or $ \sqrt[n]{\alpha}$.
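
The existence argument can also be turned into a computation by bisecting on $ f(x) = x^{n} - \alpha$. The sketch below is only an illustration: the name `nth_root` is hypothetical, and instead of the three cases used above it takes the single upper end point $ 1 + \alpha$, which works because $ (1 + \alpha)^{n} \geq 1 + \alpha > \alpha$. It assumes $ \alpha$ and $ n$ are of moderate size so that the floating point powers do not overflow.

```python
def nth_root(alpha, n, tol=1e-12, max_iter=200):
    """Approximate the positive n-th root of alpha > 0 by bisection
    on f(x) = x**n - alpha over the interval [0, 1 + alpha]."""
    if alpha <= 0 or n < 1:
        raise ValueError("need alpha > 0 and a positive integer n")
    lo, hi = 0.0, 1.0 + alpha  # f(lo) < 0 and f(hi) > 0
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        if hi - lo < tol:
            return mid
        # f is strictly increasing for x > 0, so the sign of f(mid)
        # tells us on which side of mid the root lies.
        if mid**n < alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


print(nth_root(2.0, 2))   # approximates the square root of 2
print(nth_root(27.0, 3))  # approximates the cube root of 27
```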

Another famous application of the Intermediate Value Theorem is the following:

If $ f(x)$ is a non-zero polynomial of odd degree with real coefficients then there is at least one real number $ \alpha$ such that $ f(\alpha) = 0$.

The proof is fairly straightforward. In any polynomial $ f(x)$ the term with the highest power of $ x$ is the dominant term (meaning that the sign of $ f(x)$ for numerically large values of $ x$ is the sign of this term). In case the degree of the polynomial is odd, the sign of $ f(x)$ for large negative values of $ x$ will be different from the sign of $ f(x)$ for large positive values of $ x$. Since a polynomial is continuous everywhere, the Intermediate Value Theorem shows that it must vanish for some value $ \alpha$ of $ x$.
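
The argument suggests a simple numerical procedure: enlarge a symmetric interval $ [-r, r]$ until the polynomial has opposite signs at its end points and then bisect. The following Python sketch does this. The names `poly_eval` and `odd_poly_root`, the doubling of $ r$ and the sample cubic $ x^{3} - 2x - 5$ are our own assumptions for illustration, and coefficients are listed from the highest power down.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial by Horner's rule (highest power first)."""
    result = 0.0
    for coeff in coeffs:
        result = result * x + coeff
    return result


def odd_poly_root(coeffs, tol=1e-12, max_iter=200):
    """Approximate one real root of an odd degree real polynomial."""
    if len(coeffs) % 2 != 0 or coeffs[0] == 0:
        raise ValueError("need an odd degree and a non-zero leading coefficient")
    # Double r until the leading term dominates at both end points,
    # which gives opposite signs at -r and r.
    r = 1.0
    while True:
        left, right = poly_eval(coeffs, -r), poly_eval(coeffs, r)
        if left == 0:
            return -r
        if right == 0:
            return r
        if (left < 0) != (right < 0):
            break
        r *= 2.0
    lo, hi = -r, r
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        val = poly_eval(coeffs, mid)
        if val == 0 or hi - lo < tol:
            return mid
        # Keep the half on which the sign change persists.
        if (val < 0) == (left < 0):
            lo, left = mid, val
        else:
            hi = mid
    return (lo + hi) / 2


cubic = [1.0, 0.0, -2.0, -5.0]  # x**3 - 2*x - 5
root = odd_poly_root(cubic)
print(root, poly_eval(cubic, root))
```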

It must be remarked that the above result plays a key role in one of the simplest proofs of the Fundamental Theorem of Algebra (i.e. the theorem which states that any polynomial of positive degree with complex coefficients has a root in the complex number system).

It is now time to summarize the properties of continuous functions on closed intervals. First, they have a maximum and a minimum value. Next, they attain all the values in between these two extremes. It therefore follows that:

If $ f(x)$ is a continuous function defined on closed interval $ [a, b]$ then its range is also a closed interval of the form $ [c, d]$ (which may be degenerate in case $ f(x)$ is a constant).

Another simple application of the Intermediate Value Theorem is the following:

Brouwer's Fixed Point Theorem (one-dimensional case): If $ f(x)$ is a continuous function from $ [a, b]$ to itself then there is a point $ c \in [a, b]$ for which $ f(c) = c$.

Such a point $ c$ is called a fixed point of the function $ f$ as it is left unchanged (fixed) by applying $ f$. As an example, $ x = 1$ is a fixed point of the function $ f(x) = x^{2}$. To prove Brouwer's Theorem, we note that if $ f(a) = a$ or $ f(b) = b$ then our job is done. So let us assume that neither of these possibilities holds. Since $ f$ maps $ [a, b]$ into itself, we have $ f(a) \geq a$ and $ f(b) \leq b$, and therefore in this case $ f(a) > a$ and $ f(b) < b$. It follows that the function $ g(x) = f(x) - x$ has opposite signs at the end points of the interval $ [a, b]$. Since $ g(x)$ is continuous, it follows by the Intermediate Value Theorem that there is a point $ c \in (a, b)$ such that $ g(c) = 0$ or $ f(c) = c$.
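
The proof again leads to a computation: bisect on $ g(x) = f(x) - x$. The sketch below is a minimal illustration under the assumption that $ f$ really maps $ [a, b]$ into itself. The name `fixed_point` and the choice of $ \cos x$ on $ [0, 1]$ (which maps $ [0, 1]$ into $ [\cos 1, 1]$) are ours and not part of the theorem.

```python
import math


def fixed_point(f, a, b, tol=1e-12, max_iter=200):
    """Approximate a fixed point of a continuous self-map f of [a, b]
    by bisection on g(x) = f(x) - x."""
    ga, gb = f(a) - a, f(b) - b
    if ga == 0:
        return a
    if gb == 0:
        return b
    # For a self-map of [a, b] we must have g(a) > 0 > g(b).
    if not (ga > 0 > gb):
        raise ValueError("f does not appear to map [a, b] into itself")
    lo, hi = a, b
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        gm = f(mid) - mid
        if gm == 0 or hi - lo < tol:
            return mid
        # g is positive at lo and negative at hi throughout the loop.
        if gm > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


c = fixed_point(math.cos, 0.0, 1.0)
print(c, math.cos(c))
```

Note that this finds some fixed point, not necessarily the only one; a function such as $ f(x) = x^{2}$ on $ [0, 1]$ has fixed points at both end points.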
