In the previous post we developed the concept of continuous functions and some of their local properties. It is now time to study properties of functions which are continuous on an interval. It turns out that the most useful and beautiful results present themselves when we study functions defined on a closed interval. The magic goes away when the intervals under discussion are not closed. Why this happens (i.e. why open intervals have no such magical properties) is a subtle point which we will not discuss right away.
In what follows $ [a, b]$ will denote a typical closed interval, with the implicit assumption that it is not degenerate, i.e. we must have $ a < b$. We need to take note of the following two local properties which we established in the previous post:
- If $ f(x)$ is continuous at $ x = a$ then it is bounded in a certain neighborhood of $ x = a$.
- If $ f(x)$ is continuous at $ x = a$ and $ f(a) \neq 0$ then $ f(x)$ maintains its sign in a certain neighborhood of $ x = a$.

Their counterparts for a closed interval are the following:
- If $ f(x)$ is continuous on a closed interval $ [a, b]$ then it is bounded in the interval $ [a, b]$.
- If $ f(x)$ is continuous on a closed interval $ [a, b]$ and $ f(x) \neq 0$ for any $ x \in [a, b]$ then $ f(x)$ maintains its sign in the closed interval $ [a, b]$.

We begin with the first of these properties:
Functions continuous on a closed interval are bounded in that interval.
We need to extend the definition of the function $ f$ beyond the interval $ [a, b]$ to allow the following proofs to work. We define $ f(x) = f(a)$ for all $ x < a$ and $ f(x) = f(b)$ for all $ x > b$. This way the function $ f$ becomes continuous everywhere. We know that the continuity of $ f$ at a single point $ x \in [a, b]$ implies the existence of a neighborhood $ I_{x}$ of $ x$ in which the function is bounded. The collection of all such intervals $ I_{x}$, one for each $ x \in [a, b]$, is clearly a cover of $ [a, b]$ and hence (by the Heine Borel Principle) we can choose a finite number of them, say $ I_{x_{1}}, I_{x_{2}}, \cdots, I_{x_{n}}$, which cover the entire interval $ [a, b]$. Since $ f$ is bounded in each of these finitely many intervals, the largest of the corresponding bounds is a bound for $ f$ on the entire interval $ [a, b]$. This is a one line application of the Heine Borel Principle. The fact that a finite number of intervals with some property are able to cover the whole interval $ [a, b]$ is quite useful, as it is far simpler to handle (and to comprehend) a finite number of things than an infinite number of things.
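The finite-cover argument can be made concrete with a small Python sketch. It rests on an assumption the proof does not need: that $ f$ satisfies $ |f(y) - f(x)| \leq K|y - x|$ for a known constant `K`, so that $ I_{x} = (x - 1/K, x + 1/K)$ is a neighborhood in which $ |f(y)| < |f(x)| + 1$. The name `finite_cover_bound` and the parameter `K` are hypothetical and serve only as illustration.

```python
def finite_cover_bound(f, a, b, K):
    """Choose finitely many centres whose neighborhoods (x - r, x + r),
    with r = 1/K, cover [a, b]; return (centres, bound) with |f| < bound on [a, b].

    Assumption (not part of the proof): |f(y) - f(x)| <= K * |y - x|,
    hence |f(y)| < |f(x)| + 1 for every y in the neighborhood of x.
    """
    if K <= 0 or a >= b:
        raise ValueError("need K > 0 and a < b")
    r = 1.0 / K
    centres = [a]
    # Consecutive centres are r apart, so neighbouring intervals overlap.
    while centres[-1] + r <= b:
        centres.append(centres[-1] + r)
    # The loop stops once the last neighborhood reaches beyond b.
    bound = max(abs(f(x)) for x in centres) + 1.0
    return centres, bound


# Example: f(x) = x^2 on [0, 3], where K = 6 works since |f'(x)| <= 6.
centres, bound = finite_cover_bound(lambda x: x * x, 0.0, 3.0, 6.0)
print(len(centres), bound)
```

The bound is simply the largest of finitely many local bounds, which is the step that fails when only an infinite cover is available.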
However one line applications of some principle do not demystify the whole magic and therefore it is necessary to look at the above property from different perspectives. To that end we begin by assuming on the contrary that $ f$ is unbounded in the interval $ [a, b]$. Divide the interval $ [a, b]$ into two equal sub-intervals using the midpoint $ (a + b) / 2$ so that the new intervals are $ [a, (a + b)/2]$ and $ [(a + b)/2, b]$. Since $ f$ is unbounded in the original interval $ [a, b]$ it must be unbounded in at least one of these sub-intervals, say $ [a_{1}, b_{1}]$. Repeat the same steps to get a sequence of intervals $ [a_{2}, b_{2}], [a_{3}, b_{3}], \cdots$ such that $ f$ is unbounded in $ [a_{n}, b_{n}]$ for each value of $ n$. Clearly by the Nested Interval Principle there is at least one point $ c$ which lies in all the intervals $ [a_{n}, b_{n}]$.
Also here $ b_{n} - a_{n} = (b - a)/2^{n} < (b - a) / n$ so that this is the particularly interesting case of the Nested Interval Principle where there is only one point $ c$ which lies in all the intervals $ [a_{n}, b_{n}]$ (clearly we could not have two distinct points $ c, d$ with $ c < d$, otherwise we would have $ b_{n} - a_{n} \geq d - c$ for every $ n$ and this would contradict $ b_{n} - a_{n} < (b - a) / n$ as soon as $ (b - a)/n < d - c$).
Since $ f$ is continuous at $ c$, it is bounded in some neighborhood of $ c$, say $ (c - h, c + h)$. Since $ c$ is the supremum of $ \{a_{n}\}$ and the infimum of $ \{b_{n}\}$ it follows that we can find $ m, n$ such that $ c - h < a_{m}$ and $ b_{n} < c + h$. Let's suppose that $ n \geq m$ (the case $ n < m$ is handled in the same way with the roles of $ m$ and $ n$ exchanged). Then $ a_{m} \leq a_{n}$ and therefore $ c - h < a_{n} < b_{n} < c + h$, so that $ f$ must be bounded in $ [a_{n}, b_{n}]$. This is exactly the contradiction we needed and our job is done.
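The shrinking of the nested intervals in the bisection argument can be checked with exact arithmetic. The sketch below is only an illustration: the rule used in the proof ("keep a half in which $ f$ is unbounded") cannot be computed, so a hypothetical stand-in `pick_left` is passed in, here chosen to keep the half containing $ 1/3$.

```python
from fractions import Fraction


def bisect_nested(a, b, pick_left, steps):
    """Return the intervals [a_1, b_1], ..., [a_steps, b_steps] from repeated halving.

    pick_left(lo, hi) is a hypothetical stand-in for the selection rule of the
    proof; it returns True to keep the left half and False to keep the right half.
    """
    lo, hi = Fraction(a), Fraction(b)
    intervals = []
    for _ in range(steps):
        mid = (lo + hi) / 2
        if pick_left(lo, hi):
            hi = mid
        else:
            lo = mid
        intervals.append((lo, hi))
    return intervals


target = Fraction(1, 3)
intervals = bisect_nested(0, 1, lambda lo, hi: target <= (lo + hi) / 2, 20)

for n, (lo, hi) in enumerate(intervals, start=1):
    assert hi - lo == Fraction(1, 2**n)  # b_n - a_n = (b - a)/2^n with b - a = 1
    assert hi - lo < Fraction(1, n)      # and this is less than (b - a)/n
    assert lo <= target <= hi            # the common point lies in every interval
```

Whatever selection rule is used, the lengths are forced to be $ (b - a)/2^{n}$, which is why exactly one point survives in all the intervals.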
A further direct proof is possible using the Supremum principle. Since $ f$ is clearly bounded in some neighborhood of $ a$, it follows that the set $ A = \{ x \mid x \in (a, b], f\text{ is bounded in } [a, x]\}$ is non-empty and bounded above by $ b$. Hence it has a supremum $ c$. We will show that $ c = b$ and $ c \in A$. First if $ c < b$ then since $ f$ is continuous at $ c$, it is bounded in some neighborhood $ (c - h, c + h)$. Also we can choose $ h$ small enough so that $ (c - h, c + h)$ is contained in $ [a, b]$. Then clearly there is a member $ d \in A$ such that $ c - h < d$ and $ f$ is bounded in $ [a, d]$. It follows that $ f$ is bounded in $ [a, c + h]$ so that $ c + h \in A$ which is contrary to the fact that $ c$ is supremum of $ A$. Thus $ c = b$. Now we can show that $ c = b \in A$. Clearly there is an interval of type $ (b - h, b]$ in which $ f$ is bounded (because it is continuous at $ b$). Also there is a point $ p \in A$ with $ b - h < p$ and $ f$ is bounded in $ [a, p]$ so that $ f$ is bounded in $ [a, b]$ and therefore $ b \in A$.
And finally we can definitely use the fundamental Dedekind's Theorem here. Let us divide all real numbers into two sets $ L, U$ in the following way: keep all numbers less than or equal to $ a$ in $ L$ and all numbers greater than $ b$ in $ U$. For $ x \in (a, b]$ keep $ x$ in $ L$ if $ f$ is bounded in $ [a, x]$ and put $ x$ in $ U$ otherwise. Then both the sets $ L, U$ are non-empty and $ L$ has members greater than $ a$ (because $ f$ is bounded in some neighborhood of $ a$). The sets $ L, U$ then form a section of real numbers and therefore by Dedekind's Theorem there is a real number $ c$ such that all numbers less than $ c$ are in $ L$ and those greater than $ c$ are in $ U$. Also it is clear that $ a < c \leq b$. We can show that $ c = b \in L$. If $ c < b$ then because of continuity of $ f$ at $ c$ we have a neighborhood $ [c - h, c + h]$ where $ f$ is bounded. Also we can choose $ h$ to be small enough so that $ [c - h, c + h]$ is contained in $ [a, b]$. Then since $ c - h \in L$, $ f$ is bounded in $ [a, c - h]$. It follows that $ f$ is bounded in $ [a, c + h]$, leading to the fact that $ c + h \in L$, which is not possible as $ c + h > c$. It follows that we must have $ c = b$. Now because of continuity of $ f$ at $ b$, it is bounded in $ [b - h, b]$ for some $ h > 0$, and since $ b - h \in L$, $ f$ is also bounded in $ [a, b - h]$. Therefore $ f$ is bounded in $ [a, b]$ and thus $ b \in L$.
All the above proofs are based on different but equivalent principles of the theory of real numbers. Without any reasonable theory of the real numbers it is simply not possible to establish this nice property of continuous functions. In fact this is exactly the reason that the common textbooks on calculus don't prove this property.
Now that we have understood in some detail that continuous functions on a closed interval are bounded, it follows that we have a supremum $ M$ and an infimum $ m$ for the set of values of $ f$ on interval $ [a, b]$. At this point we are not sure whether the function $ f$ is able to attain these values or not. If it is able to attain these values then $ M$ would become the maximum value of $ f(x)$ on $ [a, b]$ and $ m$ would become the minimum value of $ f(x)$ on $ [a, b]$.
Many times in real world applications we are interested in some kind of optimization, and such problems, when translated into mathematical terms, seek a maximum or a minimum value of some (complicated) function. Hence it makes sense to ask whether a function can have a maximum (minimum) value under certain conditions. Clearly a function must be bounded before we can ask it to have a maximum (minimum) value. Since the continuity of a function on a closed interval guarantees that it is bounded, it makes sense to discuss this issue of attaining a maximum (minimum) value for a function continuous on a closed interval. It turns out that our optimism is amply rewarded by the following theorem:
Functions continuous on a closed interval attain their minimum and maximum values on that interval.
Let's then assume that $ f$ is continuous on $ [a, b]$ and let $ M$ be the supremum of the set of values of $ f(x)$ on $ [a, b]$. Suppose on the contrary that $ f(x) \neq M$ for every $ x \in [a, b]$, so that $ f(x) < M$ throughout. This means that the function $ g(x) = 1/(M - f(x))$ is continuous on the closed interval $ [a, b]$ and hence $ g(x)$ is bounded in $ [a, b]$. But since $ M$ is the supremum of the values of $ f(x)$ it follows that given any number $ \epsilon > 0$ we can find a point $ x \in [a, b]$ with $ f(x) > M - \epsilon$ so that $ g(x) = 1/(M - f(x)) > 1/\epsilon$. Since the number $ \epsilon$ was arbitrary the inequality $ g(x) > 1/\epsilon$ shows that $ g(x)$ is unbounded in $ [a, b]$. This is the contradiction we required to complete the proof. In a similar fashion we can show that $ f$ attains the infimum too.
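The inequality $ g(x) > 1/\epsilon$ can be checked on a concrete function. The sketch below uses $ f(x) = x(1 - x)$ on $ [0, 1]$, which is my own choice of example and not taken from the argument above. Its supremum $ M = 1/4$ is attained at $ x = 1/2$, so $ g$ is undefined there, and at nearby points $ g$ exceeds $ 1/\epsilon$, which is the growth the proof exploits.

```python
from fractions import Fraction


def f(x):
    return x * (1 - x)


M = Fraction(1, 4)  # supremum of f on [0, 1], attained at x = 1/2


def g(x):
    # Defined only where f(x) != M; at x = 1/2 this divides by zero.
    return 1 / (M - f(x))


for k in range(1, 6):
    eps = Fraction(1, 10**k)
    x = Fraction(1, 2) + eps    # here M - f(x) = eps**2, which is less than eps
    assert f(x) > M - eps       # a value of f within eps of the supremum
    assert g(x) > 1 / eps       # so g exceeds 1/eps at that point
```

Exact rational arithmetic is used so that the strict inequalities are not blurred by rounding.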
It is possible to apply the Nested Interval Principle here too. Since $ M$ is the supremum for values of $ f$ on $ [a, b]$, it must be the supremum for values of $ f$ on at least one of the sub-intervals $ [a, (a + b)/2]$ and $ [(a + b)/2, b]$. Let's call that sub-interval $ [a_{1}, b_{1}]$. Repeating the procedure again and again we get a sequence of sub-intervals $ [a_{n}, b_{n}]$ such that $ M$ is the supremum for values of $ f$ on each $ [a_{n}, b_{n}]$. Clearly by the Nested Interval Principle there is a number $ c$ which lies in all the sub-intervals $ [a_{n}, b_{n}]$ and since $ b_{n} - a_{n} = (b - a)/2^{n} < (b - a)/n$ this $ c$ is the only point which lies in all these sub-intervals. We will prove that $ f(c) = M$. Suppose on the contrary that $ f(c) < M$ and let $ \epsilon = M - f(c)$. Then we have a neighborhood $ (c - \delta, c + \delta)$ such that $ |f(x) - f(c)| < \epsilon/2$ for all $ x \in (c - \delta, c + \delta)$. This means that $ f(c) - \epsilon/2 < f(x) < f(c) + \epsilon/2$ in this neighborhood of $ c$. Thus $ f(x) < (f(c) + M)/2 < M$ for all $ x \in (c - \delta, c + \delta)$. Hence the supremum of values of $ f$ in $ (c - \delta, c + \delta)$ is at most $ (f(c) + M)/2$ and so is definitely less than $ M$. But clearly we can find $ n$ such that $ c - \delta < a_{n} < b_{n} < c + \delta$ so that the neighborhood $ (c - \delta, c + \delta)$ contains the interval $ [a_{n}, b_{n}]$, and since the supremum for values of $ f$ on $ [a_{n}, b_{n}]$ is $ M$, the supremum for values of $ f$ on $ (c - \delta, c + \delta)$ is at least $ M$. This contradiction shows that $ f(c) = M$.
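The bisection in this proof suggests a numerical procedure, sketched below under a loud assumption: the true supremum on each half is replaced by the largest of finitely many sampled values (`sampled_max`, a hypothetical helper). The proof itself needs no such approximation, and the sketch is an illustration rather than a rigorous method.

```python
import math


def sampled_max(f, lo, hi, k=200):
    """Largest value of f at k + 1 equally spaced points of [lo, hi].

    This is only a stand-in for the supremum of f on [lo, hi].
    """
    return max(f(lo + (hi - lo) * i / k) for i in range(k + 1))


def bisect_for_max(f, a, b, steps=30):
    """Repeatedly keep the half whose sampled maximum is at least as large."""
    lo, hi = a, b
    for _ in range(steps):
        mid = (lo + hi) / 2
        if sampled_max(f, lo, mid) >= sampled_max(f, mid, hi):
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


# Example: sin on [0, 3] has its maximum value 1 at pi/2.
c = bisect_for_max(math.sin, 0.0, 3.0)
print(c, math.sin(c))
```

For this example the returned `c` should lie close to $ \pi/2$. For functions which oscillate faster than the sampling can detect, the sampled maximum may pick the wrong half, which is one reason the proof works with the exact supremum instead.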
By now the reader will have guessed how to apply the Supremum principle or Dedekind's Theorem to establish the above property and hence I leave it as a simple exercise. We will also show how the property can be proved using the Heine Borel Principle. To do so let's assume on the contrary that $ f(x) < M$ for all $ x \in [a, b]$. Then for any given point $ c \in [a, b]$ there is a neighborhood $ I_{c}$ of $ c$ such that $ |f(x) - f(c)| < \epsilon_{c} = (M - f(c))/2$ for all values of $ x \in I_{c}$. Thus we have $ f(x) < (M + f(c))/2 = M_{c}$ for all $ x \in I_{c}$ and clearly $ M_{c} < M$. The collection of all neighborhoods of the form $ I_{c}$ clearly forms a cover of the interval $ [a, b]$ and hence it is possible to choose a finite number of these neighborhoods, say $ I_{c_{1}}, I_{c_{2}}, \cdots, I_{c_{n}}$, which cover the entire interval $ [a, b]$. Now let $ M^{\prime}$ denote the maximum of $ M_{c_{1}}, M_{c_{2}}, \cdots, M_{c_{n}}$; then $ M^{\prime} < M$. Clearly any point $ x \in [a, b]$ lies in some neighborhood $ I_{c_{i}}$ and hence $ f(x) < M_{c_{i}} \leq M^{\prime}$. It follows that $ M^{\prime}$ is an upper bound for the values of $ f$ in $ [a, b]$ which is smaller than the supremum $ M$. This contradiction completes the proof.