Calculus I

Author

Aven Peters

What is calculus all about?

First, let’s do an interactive activity to explore what calculus is about and why it works. Go to www.student.amplify.com, click the key icon in the upper right of your screen, and enter the following code:

73m9tt

Why is zooming in on a graph until it looks like a line useful?

Linear functions are special because they have a constant rate of change that we can easily calculate.

That is, linear functions are changing at the same rate in every interval on the graph.

On the other hand, many functions we care about are nonlinear and have non-constant rates of change.

For example, suppose the plot below represents my distance from home (in miles) as a function of time (in minutes) while I am riding the train to campus. If I were riding at a constant speed in a perfectly straight line, this function would be linear. But in real life, that’s not usually true. Sometimes the train speeds up, slows down, or turns, so the function might be more like \(f(x) = 0.01*(x - 5)^3 + 1.5.\)

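library(ggplot2)

# Distance from home (miles) after x minutes on the train
train_function <- function(x) {0.01*(x - 5)^3 + 1.5}

# Plot the function over the first 12 minutes of the trip
ggplot(data.frame(x = 0:12), aes(x = x)) +
  stat_function(fun = train_function)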

If I wanted to find the average rate of change of this function between \(0\) and \(10\) minutes, I could do so algebraically (rise/run).

\(m_{avg} = \frac{f(10) - f(0)}{10 - 0} = \frac{(0.01*5^3 + 1.5) - (0.01*(-5)^3 + 1.5)}{10} = \frac{1.25 + 1.5 + 1.25 - 1.5}{10} = 2.5/10 = 0.25\) miles per minute (15 miles per hour).

But that average slope masks a lot of variation within the interval. In the first two minutes, my average rate of change is

\(m_{avg} = \frac{f(2) - f(0)}{2 - 0} = \frac{(0.01*(-3)^3 + 1.5) - (0.01*(-5)^3 + 1.5)}{2} = \frac{-0.27 + 1.5 - (-1.25 + 1.5)}{2} = 0.98/2 = 0.49\) miles per minute (29.4 miles per hour).

On the other hand, from minute 2 to minute 4, my average rate of change is

\(m_{avg} = \frac{f(4) - f(2)}{4 - 2} = \frac{(0.01*(-1)^3 + 1.5) - (0.01*(-3)^3 + 1.5)}{2} = \frac{-0.01 + 1.5 - (-0.27 + 1.5)}{2} = 0.26/2 = 0.13\) miles per minute (7.8 miles per hour).
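The three averages above can be checked with a few lines of R. This is only a minimal sketch: `avg_rate` is a hypothetical helper name introduced here for illustration, not something defined elsewhere in these notes.

```r
# Distance from home (miles) after x minutes, as defined above
train_function <- function(x) {0.01*(x - 5)^3 + 1.5}

# Hypothetical helper: average rate of change of f on [a, b] (rise over run)
avg_rate <- function(f, a, b) {(f(b) - f(a)) / (b - a)}

avg_rate(train_function, 0, 10)  # first ten minutes
avg_rate(train_function, 0, 2)   # first two minutes
avg_rate(train_function, 2, 4)   # minute 2 to minute 4

# Multiply by 60 to convert miles per minute to miles per hour
avg_rate(train_function, 0, 2) * 60
```

Passing the function itself as an argument means the same helper works for any other function and interval you want to try.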

Calculus allows us to find the instantaneous rate of change at one particular moment rather than simply the average rate of change on an interval. For example, we could figure out exactly how quickly the train was moving away from home at minute 6 of my commute. To do this, we can exploit the fact that many nonlinear functions start to look linear if we zoom in closely enough.

Now let’s make this precise.

Approximating the instantaneous rate of change

One way of approximating the instantaneous rate of change is by finding the average rate of change on progressively smaller intervals around the point we’re interested in. For example, we could find the average rate of change in the interval \([6, 6.5]\) (from 6 minutes to 6 minutes, 30 seconds). In this case, it would be

\[m_{avg} = \frac{f(6.5) - f(6)}{6.5 - 6} = \frac{0.01*(1.5)^3 + 1.5 - (0.01*(1)^3 + 1.5)}{0.5} = \frac{0.03375 + 1.5 - (0.01 + 1.5)}{0.5} = \frac{0.02375}{0.5} = 0.0475\] miles per minute, or 2.85 miles per hour.

What if we got a little closer? On the interval \([6, 6.1]\), the average rate of change ends up being 0.0331 miles per minute, or about 1.99 miles per hour. The following table shows what happens when we get even closer to \(6\).

Interval	Average rate of change (miles/minute)
[6, 6.1]	0.0331
[6, 6.01]	0.030301
[6, 6.001]	0.03003001
[6, 6.0001]	0.030003

So the instantaneous rate of change is probably very close to 0.03 miles/minute.
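If you would like to reproduce the table yourself, here is one possible sketch in R. The variable names `h` and `rates` are illustrative choices, and the printed values may differ from the table in the last few digits because of floating-point rounding.

```r
# Distance from home (miles) after x minutes, as defined above
train_function <- function(x) {0.01*(x - 5)^3 + 1.5}

# Interval widths, so each interval is [6, 6 + h]
h <- c(0.1, 0.01, 0.001, 0.0001)

# Average rate of change on each interval (miles per minute)
rates <- (train_function(6 + h) - train_function(6)) / h

data.frame(interval_end = 6 + h, avg_rate = rates)
```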

Limits and Continuity

To formalize this approach, we need something called limit notation. Limits describe the behavior of a function very, very close to a particular value, regardless of what happens exactly at that value.

For example, consider the following function:

\[f(x) = \begin{cases} 4x, & x < 2 \\ 0, & x = 2 \\ 10 - \frac{1}{2} x^2, & x > 2 \\ \end{cases}\]

What happens to \(f(x)\) when \(x\) is very close to but slightly smaller than \(2\)?

x	f(x)
1.9	7.6
1.95	7.8
1.99	7.96
1.999	7.996
1.9999	7.9996

As \(x\) gets very close to \(2\) from the left, \(f(x)\) gets very close to 8. We can write this using the following notation:

\[\lim_{x \rightarrow 2^{-}} f(x) = 8.\]

Similarly, we can make a table to describe the behavior of \(f(x)\) just to the right of 2.

x	f(x)
2.1	7.7950
2.05	7.8988
2.01	7.9800
2.001	7.9980
2.0001	7.9998

As \(x\) approaches \(2\) from the right, \(f(x)\) gets very close to 8. The notation for this is

\[\lim_{x \rightarrow 2^{+}} f(x) = 8.\]

In this case, the limit from the left and the limit from the right are the same, so we can write

\[\lim_{x \rightarrow 2} f(x) = 8.\]
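The one-sided behavior can also be explored numerically. The sketch below encodes the piecewise function in R with nested `ifelse()` calls (one of several reasonable ways to write it) and evaluates it at points approaching 2 from each side; the name `d` for the shrinking distances is an illustrative choice.

```r
# The piecewise function from the example above
f <- function(x) {
  ifelse(x < 2, 4 * x, ifelse(x == 2, 0, 10 - 0.5 * x^2))
}

# Distances from 2 that shrink toward zero
d <- c(0.1, 0.05, 0.01, 0.001, 0.0001)

# Values just to the left and just to the right of 2
data.frame(x_left = 2 - d, f_left = f(2 - d),
           x_right = 2 + d, f_right = f(2 + d))

# The value exactly at 2, which the limit ignores
f(2)
```

Both columns of function values should creep toward 8, while `f(2)` itself is 0.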

Note that \(f(2) = 0\), which does not equal \(8\), even though the left and right limits do. That means the function \(f\) is discontinuous at 2. In general, we say a function \(f(x)\) is continuous at a value \(x_0\) if and only if all of the following are true:

\(\lim_{x \rightarrow x_0} f(x)\) exists (that is, the limits from the right and the left are the same),
\(f(x_0)\) is well-defined and finite, and
\(\lim_{x \rightarrow x_0} f(x) = f(x_0).\)

Continuity is one of the most important ideas in calculus, and we’ll return to it soon. In the meantime, we now have the tools we need to find the instantaneous rate of change of a function.

Derivatives

Recall that in the first section, we wanted to know what speed my train was going exactly at 6 minutes. To estimate this value, we computed the average rate of change of my distance from home, \(f\), for smaller and smaller intervals around 6 minutes. That is, we found \(\frac{f(6.1) - f(6)}{6.1 - 6},\) \(\frac{f(6.01) - f(6)}{6.01 - 6},\) and so on. Using this approach, we can define a function \(f'(x)\) as follows:

\[f'(x) = \lim_{h \rightarrow 0} \frac{f(x+h) - f(x)}{h}.\]

This formula looks scary, but it’s actually a pretty simple idea: to find the instantaneous rate of change of \(f\) at a point \(x,\) we find the average slope of \(f(x)\) on the interval from \(x\) to \(x+h.\) Then we make \(h\) closer and closer to \(0\), so that our slope gets closer and closer to the slope right at \(x.\)

\(f'(x)\) is called the derivative of \(f\) at \(x,\) and it is also sometimes written in these other ways:

\(\frac{dy}{dx}\) (where \(y = f(x)\))
\(\frac{d}{dx}f(x)\)
\(D_x f\)

Example:

Recall our example from earlier, \(f(x) = 0.01*(x - 5)^3 + 1.5.\) We can use the limit definition of the derivative to find the instantaneous rate of change at \(6\) like this:

\[ f'(6) = \lim_{h \rightarrow 0} \frac{f(6+h) - f(6)}{h} = \lim_{h \rightarrow 0} \frac{0.01*(6+h - 5)^3 + 1.5 - 0.01*(6-5)^3 - 1.5}{h}.\]

\[ f'(6) = \lim_{h \rightarrow 0} \frac{0.01*(1^3 + 3*1^2*h + 3*1*h^2 + h^3) - 0.01*1^3}{h}.\]

\[ f'(6) = \lim_{h \rightarrow 0} \frac{0.01(3*1^2*h + 3*1*h^2 + h^3)}{h}.\]

\[ f'(6) = \lim_{h \rightarrow 0} 0.01(3*1^2 + 3*1*h + h^2).\]

But remember, \(h\) is approaching 0, so the terms that contain \(h\) vanish and only the first term remains:

\[ f'(6) = 0.01*3*1^2 = 0.03.\]

This is exactly what we expected \(f'(6)\) to be when we estimated it numerically.
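One way to double-check the algebra is to compare the original difference quotient with the simplified expression \(0.01(3 + 3h + h^2)\) for a few values of \(h\). The sketch below does this in R; the function names `diff_quotient` and `simplified` are hypothetical and chosen only for this illustration.

```r
# Distance from home (miles) after x minutes, as defined above
train_function <- function(x) {0.01*(x - 5)^3 + 1.5}

# Difference quotient at x = 6, straight from the limit definition
diff_quotient <- function(h) {(train_function(6 + h) - train_function(6)) / h}

# The simplified expression obtained after cancelling h
simplified <- function(h) {0.01 * (3 + 3 * h + h^2)}

h <- c(0.5, 0.1, 0.01, 0.001)
data.frame(h = h, diff_quotient = diff_quotient(h), simplified = simplified(h))
```

The two columns should agree up to floating-point rounding, and both approach 0.03 as \(h\) shrinks.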

Derivative Rules

Finding the derivative using the limit definition every time would be very cumbersome, especially for very complicated functions. Luckily, we can use the limit definition to derive some more general rules. For example,

\((f + g)'(x) = f'(x) + g'(x).\)
For any \(a \in \mathbb{R}\), if \(f(x) = a*g(x),\) then \(f'(x) = a*g'(x).\)
If \(f(x) = a\) for some constant \(a \in \mathbb{R}\), \(f'(x) = 0.\)
Power Rule: If \(f(x) = x^n,\) where \(n \in \mathbb{R},\) then \(f'(x) = nx^{n - 1}.\)
Product Rule: If \(h(x) = f(x)*g(x),\) then \(h'(x) = f'(x)*g(x) + f(x)*g'(x).\)
Quotient Rule: If \(h(x) = \frac{f(x)}{g(x)},\) then \(h'(x) = \frac{f'(x)*g(x) - f(x)*g'(x)}{[g(x)]^2}\).
Chain Rule: If \(h(x) = f(g(x))\), then \(h'(x) = f'(g(x))*g'(x).\)
If \(f(x) = e^x,\) then \(f'(x) = e^x.\)
If \(f(x) = \ln(x),\) then \(f'(x) = \frac{1}{x}.\)
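These rules can be sanity-checked numerically. The sketch below compares the product rule and the chain rule against a central-difference estimate of the derivative. The helper `num_deriv`, the step size `1e-5`, and the example functions are all assumptions made for illustration; a numerical estimate like this is a check, not a proof.

```r
# Hypothetical helper: central-difference estimate of f'(x)
num_deriv <- function(f, x, h = 1e-5) {(f(x + h) - f(x - h)) / (2 * h)}

x <- c(0.5, 1, 2)

# Product rule with f(x) = x^3 and g(x) = exp(x)
product_rule <- 3 * x^2 * exp(x) + x^3 * exp(x)
product_error <- num_deriv(function(x) x^3 * exp(x), x) - product_rule

# Chain rule with outer function log(u) and inner function x^2 + 1
chain_rule <- (1 / (x^2 + 1)) * (2 * x)
chain_error <- num_deriv(function(x) log(x^2 + 1), x) - chain_rule

# Both sets of differences should be very close to zero
product_error
chain_error
```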

It’s possible to find the derivative of most functions you come across using one or more of these rules and a little creativity. If you’d like to practice applying these rules, you’ll have a chance later on. If you’re feeling a bit more adventurous, you could try to prove a few of these rules using the limit definition of the derivative–all but the last three are pretty easy to prove.

Let’s use these rules to find the derivative of our function from earlier, \(f(x) = 0.01*(x - 5)^3 + 1.5.\)

First, we can separate the two pieces using the first rule:

\[f'(x) = \frac{d}{dx}[0.01*(x - 5)^3] + \frac{d}{dx}[1.5].\] The second part, \(\frac{d}{dx}[1.5]\), is zero because it is a constant. We can also factor out the \(0.01\) because it’s a constant multiplied by a function, so we have

\[ f'(x) = 0.01*\frac{d}{dx}[(x-5)^3].\] Now let’s use the chain rule. We can write \((x - 5)^3\) as \(u(g(x))\) for \(u(x) = x^3\) and \(g(x) = x - 5.\) (We call the outer function \(u\) because \(f\) already names our train function.) \(g'(x)\) is just \(1\) using the power rule, while \(u'(x) = 3x^2.\) When we put these together, we get

\[ f'(x) = 0.01*u'(g(x))*g'(x) = 0.01*3(x-5)^2*1 = 0.03(x-5)^2.\]

At \(x = 6,\) we have \(f'(6) = 0.03*(6 - 5)^2 = 0.03*1^2 = 0.03,\) which is the answer we got from the limit definition of the derivative and the numerical approximation.
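R can also differentiate simple expressions symbolically with the base function `D()`, which gives a third way to check this result. This is a minimal sketch; the names `train_expr` and `train_deriv` are illustrative, and the expression R prints may be arranged differently from \(0.03(x-5)^2\) even though it is algebraically equivalent.

```r
# The train function written as an unevaluated R expression
train_expr <- quote(0.01 * (x - 5)^3 + 1.5)

# Symbolic derivative with respect to x
train_deriv <- D(train_expr, "x")
train_deriv

# Evaluate the derivative at x = 6
eval(train_deriv, list(x = 6))
```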

Practice

Find the derivative of the following functions:

\(f(x) = 5x^2 - 3x + 1.\)
\(g(x) = 2\ln(x - 3).\)
\(h(x) = x^2*e^{-x}\).
\(f(x) = \frac{1}{x^7}.\)
\(g(x) = \frac{2-x}{x+3}.\)

Derive one or more of the derivative rules using the limit definition and/or the other rules.

Differentiability

Remember when we created functions that didn’t look like a straight line when we zoomed in on them? Just as there are some functions that are not continuous, there are also some functions that are not differentiable. Formally, there are three cases where it wouldn’t make sense to take the derivative:

If a function \(f\) is not continuous at a point \(x_0\), then \(f\) is also not differentiable at \(x_0.\)
If \(\lim_{h \rightarrow 0 ^+} \frac{f(x_0+h)-f(x_0)}{h} \neq \lim_{h \rightarrow 0 ^-} \frac{f(x_0+h)-f(x_0)}{h},\) then the derivative of \(f\) at \(x_0\) does not exist.
If \(\lim_{h \rightarrow 0} \frac{f(x_0 + h) - f(x_0)}{h} = \infty\) or \(- \infty\), then \(f\) is not differentiable at \(x_0.\)
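As one illustration of the first case, the sketch below computes difference quotients from the right and from the left for the piecewise function from the limits section, which is discontinuous at 2. The helper name `diff_quotient` is hypothetical, and this is only a numerical illustration rather than a proof.

```r
# The piecewise function from the limits section
f <- function(x) {
  ifelse(x < 2, 4 * x, ifelse(x == 2, 0, 10 - 0.5 * x^2))
}

# Difference quotient of f at x0 for step h (h may be negative)
diff_quotient <- function(f, x0, h) {(f(x0 + h) - f(x0)) / h}

h <- c(0.1, 0.01, 0.001)
data.frame(h = h,
           from_right = diff_quotient(f, 2, h),
           from_left = diff_quotient(f, 2, -h))
```

Because \(f(2) = 0\) sits far from the nearby values, the quotients grow without bound as \(h\) shrinks, in opposite directions on the two sides, so no single slope exists at 2.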

Exercise: Take a few minutes to think about why these rules do or don’t make sense geometrically. Can you think of an example of each violation of differentiability?

Applications of differentiation

Differentiation (the process of taking the derivative of a function) is extremely useful. Here are a few applications that are relevant for social science and statistics.

Linear Approximation

Some function values are difficult to compute manually, but when \(a\) is small we can approximate them using the derivative with the following formula:

\[ f(x_0 + a) \approx f(x_0) + f'(x_0)*a.\]

For example, let \(g(x) = \sqrt{x}.\) It’s fairly difficult to compute \(\sqrt{9.1}\) by hand. However, \(g(9)\) and \(g'(9)\) are pretty easy to compute. (Mini-exercise–find \(g'(x)\) using the power rule.) We can approximate \(g(9.1)\) as

\[g(9.1) \approx g(9) + g'(9)*0.1 = 3 + \frac{1}{2\sqrt{9}}*0.1 = 3 + \frac{1}{6}*0.1 = 3.01666...\] If we compute \(\sqrt{9.1}\) directly, we get 3.016621. That’s a really good approximation! Approximations in this spirit (often refined with higher-order terms) are part of how computers evaluate many functions under the hood, and back before we had computers to do it, humans did this work by hand!
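The same calculation can be written as a small R sketch. The helper `linear_approx` and the variable names are hypothetical, introduced only to mirror the formula \(f(x_0) + f'(x_0)*a\).

```r
g <- function(x) {sqrt(x)}
g_prime <- function(x) {1 / (2 * sqrt(x))}  # derivative of sqrt(x)

# Hypothetical helper: linear approximation of f at x0 + a
linear_approx <- function(f, f_prime, x0, a) {f(x0) + f_prime(x0) * a}

approx_value <- linear_approx(g, g_prime, 9, 0.1)
exact_value <- sqrt(9.1)

approx_value
exact_value
approx_value - exact_value  # approximation error
```

Try increasing `a` (for example, approximating \(\sqrt{12}\) from \(x_0 = 9\)) to see how the error grows as you move farther from \(x_0\).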

Optimization

We can also use differentiation to find the maximum and minimum values of a function. This is how we derive many formulas in statistics, including the formulas for most types of regression. A full discussion of optimization is beyond the scope of this lecture, but we can touch on it conceptually. Consider the function \(f(x) = 36 - x^2,\) which looks like this:

# The function we want to maximize
func <- function(x) {36 - x^2}

# Plot it on the interval from -7 to 7
ggplot(data = data.frame(x = -7:7), aes(x = x)) +
  stat_function(fun = func)

Visually, we can see that \(f(x)\) has a maximum at \(x = 0.\) What is true of the derivative at this point?

It turns out that for any function \(f\) that is differentiable on the interval \([a,b]\) and has a maximum or minimum value, the maximum or minimum occurs at one of the following points:

The endpoints, \(x = a\) or \(x = b\)
Any values of \(x\) for which \(f'(x) = 0.\)
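For the example \(f(x) = 36 - x^2\), this candidate list can be checked with a short R sketch. It assumes the interval \([-7, 7]\) (the range used in the plot, not an interval specified in the text) and uses the base R functions `uniroot()` and `optimize()`; the names `func_prime`, `critical_point`, and `candidates` are illustrative.

```r
func <- function(x) {36 - x^2}
func_prime <- function(x) {-2 * x}  # derivative from the power rule

# Solve f'(x) = 0 on the assumed interval [-7, 7]
critical_point <- uniroot(func_prime, interval = c(-7, 7))$root

# Compare f at the two endpoints and at the critical point
candidates <- c(-7, critical_point, 7)
data.frame(x = candidates, f = func(candidates))

# Cross-check with base R's one-dimensional numerical optimizer
numeric_max <- optimize(func, interval = c(-7, 7), maximum = TRUE)$maximum
numeric_max
```

The largest value of `f` among the candidates occurs at the critical point near \(x = 0\), matching what the plot suggests.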

There’s also something called the Second Derivative Test that helps us figure out whether these points are maxima, minima, or neither, which also relies on differentiation. For the sake of time, I won’t cover that test now, nor will I ask you to actually find any maxima or minima (although it’s easy to find practice problems elsewhere). For now, just know that being able to calculate the exact value(s) of \(x\) that maximize or minimize any differentiable function we want is essential for statistics (and many other fields), and it would be extremely difficult to do this without calculus.

Conclusion

Today, we’ve learned about one branch of calculus, differentiation. Tomorrow we will focus on another branch, integration, which was actually developed first. Differentiation is all about finding the instantaneous rate of change of a function, which turns out to be extremely useful. Integration will focus on something conceptually different–the accumulation of change by a function over an interval. The connection between differentiation and integration may not be obvious yet (and if you already know it, try not to spoil it for everyone else!). By the end of tomorrow, you should have a good grasp of what differentiation and integration are, why you might want to do them, and how they’re related.