## An Equation Transformation Example

One of the many skills maths students develop in high school is the ability to change the position of a graph. I sometimes need to remind my students that this is not just “busy” work. It is a skill used frequently in technical fields such as science and engineering. What follows is an example from celestial mechanics.

### The Two-Body Problem

Calculating orbits and their characteristics is usually a computer-intensive exercise. For example, how far from the earth is an orbiting satellite at a particular time? However, a good first approximation is to pretend we are in an unrealistic universe where only two point masses exist. In this universe, the shape of orbits can be perfectly modelled with equations called conic sections. Why they are called conic sections is a bit beyond the scope of this post, but please feel free to look that up.

The orbital shape I want to describe here is the ellipse. The equation of an ellipse which is centered at the origin is: $\frac{x^{2}}{a^{2}} +\frac{y^{2}}{b^{2}} =1$ where 2a is the length of the ellipse in the x direction and 2b is the length of the ellipse in the y direction. If a > b, the line through the x-intercepts (±a, 0) is called the major axis of the ellipse and the line through the y-intercepts (0, ±b) is called the minor axis:

This is a possible shape of an orbit about one of the masses in our perfect two-mass universe. In reality, the orbit of, say, a satellite around the earth only approximates this shape. Other factors, such as the other objects in the solar system and the fact that the earth is not a point mass and has an uneven distribution of mass, make the orbital shape slightly different from an ellipse. Not only that, in our perfect universe the orbital path lies exactly in a plane and can be completely drawn on flat paper. In reality, orbits also go below and above the average plane of the orbit.

Now one of the masses, say m₂, is on the ellipse. The other mass, m₁, is not at the center of the ellipse, the origin in the above figure. So where is it?

An ellipse has two special points associated with it called focal points. These points have the property that, for any point on the ellipse, the sum of its distances to the two focal points is a constant. This constant is 2a if the longest dimension of the ellipse is along the x-axis. Otherwise, this constant length is 2b.

The other mass, m₁, is at one of these focal points. The coordinates of these points can be found using Pythagoras’ theorem. Looking at the blue triangle above, you can see that the distance c from the center to each focal point is $c=\sqrt{a^{2} -b^{2}}$

### An Example

Suppose we want to analyze the relationship between two masses that are in orbit. In orbital mechanics, the shape of an orbit is given by its eccentricity, e. An elliptical orbit has an eccentricity between 0 and 1. By the way, a perfectly circular orbit has an eccentricity of 0. So let’s say we have an orbit with the major axis length 2a = 10 and minor axis length 2b = 6. This is an orbit with an eccentricity of 0.8. If we want to know the position of m₂ with respect to m₁, we need a coordinate system. A convenient one would be a Cartesian system aligning the major axis of the orbit along the x-axis and centering the ellipse at the origin. This way we can immediately write down the equation of the orbit:$\frac{x^{2}}{25} +\frac{y^{2}}{9} =1$

as a = 5 and b = 3. This means that the focal points are $c=\sqrt{25 -9} =4$ units from the center:
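As a quick check of these numbers, here is a minimal Python sketch. The function name is my own, and it uses the standard relation e = c/a for the eccentricity, which is not derived in this post.

```python
import math

def ellipse_focus_and_eccentricity(a, b):
    """Return (c, e) for an ellipse with semi-major axis a and semi-minor axis b.

    c is the distance from the centre to each focal point and
    e = c / a is the eccentricity (standard relation, assumed here).
    """
    if not (a >= b > 0):
        raise ValueError("expected a >= b > 0")
    c = math.sqrt(a**2 - b**2)
    e = c / a
    return c, e

# The orbit in this example: 2a = 10 and 2b = 6
c, e = ellipse_focus_and_eccentricity(5, 3)
print(f"c = {c}, e = {e}")
```

With a = 5 and b = 3 this reproduces the focal distance of 4 and the eccentricity of 0.8 quoted above.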

Two of the more basic pieces of information we would like to know (especially if this is a satellite orbiting the earth) are the distance r between m₁ and m₂ and the direction of m₂ with respect to m₁. We can define the direction as the angle 𝜃 that the line connecting the two masses makes with the positive x-axis. This is actually the reference used in much of celestial mechanics. This angle is called the true anomaly.

Well this is nice, but since we want to find relationships of m₂ with respect to m₁, it would be even more convenient to put the origin of our coordinate system at m₁. How do we do that?

### Transformations to the Rescue

Somewhere in your year 10 or 11 maths (grades 10 or 11 in the States), you took the equation of a standard parabola, y = x², and replaced the x with x − h to get y = (x − h)². This moved the parabola h units to the right if h is positive or to the left if h is negative.

It turns out that in any relationship between x and y, replacing the x with x − h has the exact same effect on its graph. So in our example, if we move the ellipse 4 units to the left (h = −4, so x becomes x + 4), the equation becomes $\frac{( x+4)^{2}}{25} +\frac{y^{2}}{9} =1$ and m₁ would be on the origin of our coordinate system.

We can now calculate r and 𝜃 a bit more easily than before the transformation. Let’s look at where m₂ is in the above figure. Here, m₂ is at (-4,3). The following process would work with any point, but at this point, the numbers are “nicer”.

The figure below shows the answers. The distance r can be found using Pythagoras’ theorem, but once you see that the two sides are 3 and 4, you can recognise the standard 3-4-5 right triangle:
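The same calculation can be sketched in a few lines of Python. The helper names are hypothetical, and I am assuming the shifted ellipse (x + 4)²/25 + y²/9 = 1 with m₁ at the origin and m₂ at the top of the minor axis.

```python
import math

def shift_left(point, h):
    """Move a point h units to the left."""
    x, y = point
    return (x - h, y)

def on_shifted_ellipse(x, y, a=5.0, b=3.0, shift=4.0):
    """True if (x, y) satisfies (x + shift)^2 / a^2 + y^2 / b^2 = 1."""
    return math.isclose((x + shift) ** 2 / a**2 + y**2 / b**2, 1.0)

def distance_and_angle(x, y):
    """Distance from the origin and angle (degrees) from the positive x-axis."""
    r = math.hypot(x, y)
    theta = math.degrees(math.atan2(y, x))
    return r, theta

# Top of the minor axis on the centred ellipse is (0, 3); shift it 4 units left.
m2 = shift_left((0.0, 3.0), 4.0)
r, theta = distance_and_angle(*m2)
print(m2, on_shifted_ellipse(*m2), r, round(theta, 2))
```

Using `atan2` rather than a plain inverse tangent keeps the angle in the correct quadrant, which matters here because m₂ is to the left of m₁.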

In this perfect universe, other orbital shapes are possible: circles, parabolas, and hyperbolas. These have their own standard equations, but they can all be transformed to move them to any place you wish on the coordinate system to make your calculations easier.

## Modelling and Graphs

This post presents a simplified engineering modelling and design example. I offer this as a motivation to students struggling with graphing topics like linear, quadratic, cubic, etc equations. This is not just busy work. It is the beginning of developing skills you will need to work in technical professions like engineering.

## Why Modelling?

Engineers use mathematical models for many reasons. Some of these reasons are:

1. Easier and cheaper to analyse and design than the physical thing being modelled
2. Cheaper to tweak to do “what if” analysis
3. Necessary when the physical object needs to be encoded, eg a control system for an aircraft – the control system needs to know how the aircraft will react to its controls.

## The Example

Let’s use a mathematical model in a familiar scenario.

### The Spring

Early automobiles used a suspension that just consisted of leaf springs to support the entire carriage. Later, coil springs for each wheel were used. With just springs, what is the reaction of the carriage (the part that people sit in) to a bump in the road? I think you can imagine that it would look like this:

This is a plot of a cosine function which you may have seen already. This is a model of what would happen to a car with just springs. But it’s not a good model since it assumes that the oscillations would go on forever. Of course, in real life, the springs would lose energy and there would be wind resistance as well. But engineers typically start out with a simple model and add complexity as they continue analysing it.

This particular function I just plotted is x = 5cos(3t), where x is the position of the carriage and t is time. The “5” corresponds to the strength of the bump in the road, and the “3” relates to how weak or stiff the spring is.

The letter k is usually used to represent the spring stiffness. The higher k is, the stiffer the spring and the faster the oscillations experienced by the people in the car. Changing the “5” just changes the extreme values of the curve, but changing the “3” has a more interesting effect. Below are three graphs showing the effect of changing k:

### The Damper

As I mentioned before, just using a cosine function to model the spring suspension is not good because there is natural damping that occurs due to energy loss in the spring (heat) and resistance due to the air around the car. The real reaction of the carriage would look something like this:

But even this would not be comfortable as the car would be frequently bouncing as it goes over bump after bump (but certainly more comfortable than no suspension at all!). So more damping would be desirable.

A function that has a damping shape is the exponential function $x = a^{-dt}$, where the “a” and the “d” are some positive constants for a particular equation (with a greater than 1). As will be seen when you study calculus, a convenient “a” to use is the irrational number e. Don’t worry if you haven’t seen e before. It is just a number approximately equal to 2.71828. I say approximately because e, like 𝜋, is irrational so it has a non-terminating, non-repeating decimal part. The d indicates how strong the damping effect is; the larger d is, the stronger the damping.

So as an example, the plot of $x = 5e^{-0.5t}$ looks like:

The “5” in this example is just to provide the initial bump. So it would be nice if we could combine the action of the spring with a damping force that a shock absorber would provide. This can be done by multiplying the shock absorber model $x = 5e^{-0.5t}$ and the spring model $x = 5\text{cos}( 3t)$ together: $x = 5e^{-0.5t}\text{cos}( 3t)$, where we just need the one “5” for the initial bump. The plot of this looks like:

Much better but room for improvement. Depending on the type of car we are designing, we may want a stiff response where the damping is strong (like in a sports car where one wants to “feel the road”) or a softer one for a family car for example.

Below are two plots changing the damping strength and using k = 3:

A carriage following the red curve would be a better ride. The damping can be made stronger for a sports car.
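To experiment with the damping yourself, here is a minimal Python sketch of the combined model. The function name and the sample values of d are my own choices, and I am assuming the exponent is negative, as in the 5e^(-0.5t) example above.

```python
import math

def carriage_position(t, A=5.0, d=0.5, k=3.0):
    """Damped spring model: x(t) = A * e^(-d*t) * cos(k*t)."""
    return A * math.exp(-d * t) * math.cos(k * t)

# Compare a lightly damped ride with a more heavily damped one.
for d in (0.5, 2.0):
    samples = [round(carriage_position(t, d=d), 3) for t in (0.0, 1.0, 2.0, 3.0)]
    print(f"d = {d}: {samples}")
```

Both runs start at the same bump height of 5, but the larger d shrinks the later oscillations much faster.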

When the car is actually built and testing reveals that the car responds to bumps as was predicted by the engineer’s model, well that’s a very good feeling!

### Why Study Maths?

The engineer in this discussion used several skills that you are studying now:

1. Drawing graphs of equations
2. How to change the shape of those graphs by changing the values in the equation
3. Circular (trigonometric) and exponential functions
4. Calculus
5. And all of the above is based on algebra.

The calculus skill is not apparent in the discussion so far, but I will explain that further in the next section.

### Creating a Model

We can generalise the model for the suspension of a car as $x = Ae^{-dt}\text{cos}( kt)$ where A is the size of the initial bump in the road, d is the stiffness of the shock absorbers, and k is the strength of the springs. How did I know to use the exponential and cosine functions? They are the solution to what are called differential equations, which are based on calculus.

When developing a model, especially for dynamic (moving) systems, engineers start with basic physics. Frequently, Newton’s second law is the starting point: $F = ma$ where F is the force applied or exerted by a mass m, accelerating (or decelerating) by a. Now, we want a model that predicts x, the position of the car carriage. Many applications of calculus involve the rate of change of something. The rate of change of position is called velocity, which you are familiar with. And the rate of change of velocity is called acceleration. Newton’s second law explains why you feel your back press against the seat only when you accelerate (or against the seatbelt when you decelerate). The rate of change of position, that is velocity, is mathematically called the first derivative of position. The rate of change of velocity, that is acceleration, is the first derivative of velocity or the second derivative of position. So Newton’s second law is really an equation about the second derivative of position. Equations that have derivatives in them are called differential equations, and these are very familiar to engineers.

Notation-wise, a single dot is shown above a variable to indicate a first derivative and two dots represent a second derivative. So another way to show Newton’s second law is $F=m\ddot{x}$ as acceleration is the second derivative of the position, x.

Now without going into the detail, the forces exerted on the carriage by the springs and the shock absorbers have sizes kx and dẋ respectively, and both act against the motion, so they enter with a negative sign. So if you replace the F in Newton’s second law with these forces, you get the differential equation $-kx-d\dot{x}=m\ddot{x}$

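An engineer uses calculus to solve this equation for the position x as a function of the time t, that is x(t). Given certain initial conditions like the size of the bump experienced by the car, and the values of k and d, a solution to this equation can take the form $x = Ae^{-dt}\text{cos}( kt)$. (Strictly, the decay rate and the oscillation frequency in the solution are combinations of m, d, and k rather than d and k themselves, but the shape of the curve is the same.) This is where I got the cosine and the exponential function from at the beginning.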
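If you are curious how a computer would handle this, below is a rough Python sketch that steps the differential equation forward in time and compares the result with a closed-form solution. Several things here are my assumptions and not part of the discussion above: the spring and damper forces are taken to oppose the motion (m·ẍ = −k·x − d·ẋ), the carriage starts at rest at height x0, the stepping scheme is the classical Runge-Kutta method, and the constants m = 1, d = 1, k = 9.25 are chosen only because they give a decay rate of 0.5 and a frequency of 3.

```python
import math

def simulate_carriage(m, d, k, x0, v0=0.0, t_end=5.0, dt=0.001):
    """Integrate m*x'' = -k*x - d*x' with classical RK4 and return x(t_end)."""
    def accel(x, v):
        return (-k * x - d * v) / m

    x, v = x0, v0
    for _ in range(int(round(t_end / dt))):
        k1x, k1v = v, accel(x, v)
        k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v)
        k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v)
        k4x, k4v = v + dt * k3v, accel(x + dt * k3x, v + dt * k3v)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x

def exact_underdamped(m, d, k, x0, t):
    """Closed-form solution for x(0) = x0, x'(0) = 0 in the underdamped case."""
    gamma = d / (2 * m)                  # decay rate
    omega_sq = k / m - gamma**2
    if omega_sq <= 0:
        raise ValueError("not underdamped: no oscillation for these constants")
    omega = math.sqrt(omega_sq)          # oscillation frequency
    return x0 * math.exp(-gamma * t) * (
        math.cos(omega * t) + (gamma / omega) * math.sin(omega * t)
    )

numeric = simulate_carriage(1.0, 1.0, 9.25, 5.0, t_end=5.0)
exact = exact_underdamped(1.0, 1.0, 9.25, 5.0, 5.0)
print(numeric, exact)
```

The closed form has an extra sine term compared with the simple model in this post. That term comes from insisting the carriage starts with zero velocity, and it is small when the damping is light.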

Quadratic equations come in several generic forms (or patterns) but they all have several things in common:

1. The highest power of x (the independent variable) is 2 when all expressions are expanded in polynomial form.
2. The other integer powers, 1 meaning just x, and 0 meaning a constant term, may or may not be present. But the squared term must be present.
3. Other powers (negative integers, and non-integers), cannot be present.

The most common general form of a quadratic equation is $y=ax^2 +bx+c,$ where a, b, and c are constants specific to a particular equation. For example, $y=3x^2 -2x+7.$ Here a = 3, b = -2, and c = 7. There are other forms as well: $y=k\left(x+a\right)\left(x+b\right)$ and $y=a{\left(x-h\right)}^2 +k,$ where the unspecified constants, such as a and k, are specific to each form and are not the same numbers when converting between these forms. The most basic quadratic equation is $y=x^2.$ Choosing various values of x, then squaring them to get the corresponding y values, and plotting these on a Cartesian coordinate grid, creates the following curve:

This shape is called a parabola. All quadratic equations have this shape when plotted, but their position, orientation, and scale may be different. Each form of the quadratic equation has its advantages. This lesson, however, will concentrate on the form $y=a{\left(x-h\right)}^2 +k.$

## The ‘a’ Factor

So we will be looking at the quadratic form $y=a{\left(x-h\right)}^2 +k.$ This form is called the turning point form. Let’s start out simple and look at the effect of a alone by setting the h and k to zero. This leaves us with the equation $y=ax^2.$ This coefficient in front of the x² term scales and orients the parabola. If a is negative, all the y values are now negative (or zero). This flips (reflects) the parabola across the x-axis. If a is a large number, greater than 1, then the y values are larger for a given x than the y values in the basic y = x² parabola. This has the effect of making the parabola sharper, that is, it is dilated away from the x-axis (stretched in the y direction). If a is a fraction between -1 and 1, then the y values increase more slowly. This has the effect of making the parabola flatter, which is also a dilation from the x-axis (this time a compression in the y direction). Below are several graphs of y = ax² for various values of a. The basic parabola is shown (dashed curve) for comparison:

Notice how the negative sign flips the parabola across the x-axis.

## The k Effect

Now let’s look at the equation $y=ax^2 +k.$ I have just added a k to the previous equation form. If you add or subtract a constant number to an equation, it just raises or lowers the graph of the equation by k units. This is independent of the effect that a has on the curve. Below are examples for two choices of k using an a that was used above for comparison:

## The h Reaction

Now so far we have scaled and inverted our parabola and moved it up or down. What about moving it right or left? That’s what the h does in the form $y=a{\left(x-h\right)}^2 +k.$ We get to this form by replacing x with x − h. Whenever this is done in any equation, not just a quadratic equation, this moves the curve to the right h units if h is positive or to the left if h is negative. But be careful. There is a negative sign in the form y = a(x − h)² + k, so in y = a(x − 3)² + k, h = 3 and this parabola is moved 3 units to the right. Whereas a(x + 3)² + k can be thought of as a(x − (−3))² + k, so h is −3 and this moves the parabola 3 units to the left. The effect of h on the graph of a parabola is independent of the effects of a or k.
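To see a, h, and k working together, here is a small Python sketch (the function names are mine). It builds a parabola in turning point form and also expands it to the general form y = ax² + bx + c, using b = −2ah and c = ah² + k, which you get by multiplying out the bracket.

```python
def turning_point_form(a, h, k):
    """Return the function y(x) = a*(x - h)**2 + k."""
    def y(x):
        return a * (x - h) ** 2 + k
    return y

def expand_to_general_form(a, h, k):
    """Coefficients (a, b, c) of the same parabola written as a*x**2 + b*x + c."""
    return (a, -2 * a * h, a * h**2 + k)

# Example: scale by 2, move 3 units right and 1 unit up.
y = turning_point_form(2, 3, 1)
a, b, c = expand_to_general_form(2, 3, 1)
for x in (2, 3, 4):
    print(x, y(x), a * x**2 + b * x + c)
```

Both columns of output agree, and the smallest value occurs at x = 3, which is the turning point (h, k) = (3, 1).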

Below are two examples of the effect of h using the last example above for comparison:

## The Derivative, Part 8

Now let’s do some more examples using not only the chain rule, but using a combination of the rules we have covered.

Let me start with an example that illustrates the “chainyness” of the chain rule. Let $f( x) =\text{sin}\left(\sqrt{x^{2} +2x-7}\right)$

Notice that there are three operations at work here: the sine, the square root and the polynomial. Referring back to the previous post, which is the outermost function? It’s the sine as that would be the last operation you would perform if you were to actually calculate the function for a particular x value. So the derivative rule for the sine is the first differentiation rule we will use.

So we have the sine of “something”, so we start with the derivative of the sine, keeping that “something” intact: $f'( x) =\text{cos}\left(\sqrt{x^{2} +2x-7} \right)\ ( …)$ Now from the last post, you know you have to multiply this by the derivative of that “something”. It will be helpful to rewrite that “something” as $\sqrt{x^{2} +2x-7} =\left( x^{2} +2x-7\right)^{1/2}$ Looks like we need to apply the chain rule again as we have an inner (polynomial) function and an outer (power) function.

The derivative of the “something” to the 1/2 power is $\frac{1}{2}\left( x^{2} +2x-7\right)^{-1/2}( …)$

We are now left with the innermost function x² + 2x – 7. The chain rule says to multiply the previous results with the derivative of this innermost function which is 2x + 2. So putting this in the last (…) and then putting that result in the first (…) gives $f'( x) =\frac{1}{2}\left( x^{2} +2x-7\right)^{-1/2}( 2x+2)\text{cos}\left(\sqrt{x^{2} +2x-7}\right)$ Do you see how the successive differentiations of the functions from the outermost to the innermost work with the chain rule?

Let’s do another example. Let’s differentiate $f( x) =\sqrt{\text{sin}( x)\text{cos}( x)}$

As we did before, it’s easier to see the applicable differentiation rule if you convert the square root to its equivalent exponent form:$f( x) =\left[{\text{sin}( x)\text{cos}( x)}\right]^{1/2}$

Hopefully you can now identify the outermost operation as raising “something” to the 1/2 power. So the power rule is the one to use at first:$f'( x) =\frac{1}{2}[\text{sin}( x)\text{cos}( x)]^{-1/2}( …)$

So we now need to multiply this by the derivative of the “something” which is sin(x)cos(x). But this is the multiplication of two functions so we need to use the multiplication rule. Letting u = sin(x) and v = cos(x), we have u‘ = cos(x) and v‘ = −sin(x), so u‘v + uv‘ becomes cos²(x) – sin²(x). So now replacing the (…) with this results in $f'( x) =\frac{1}{2}[\text{sin}( x)\text{cos}( x)]^{-1/2}\left[\text{cos}^{2}( x) -\text{sin}^{2}( x)\right]$

This last example highlights the point that to find the derivative of complex functions frequently requires the use of several differentiation rules. You need to be aware of where you are in a particular problem and which rule you are currently working on.
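One way to gain confidence in answers like these is to compare them with a numerical estimate of the slope. The sketch below does that for both examples in this post. The central-difference helper and the test points (x = 3 and x = 0.5, chosen so the square roots are defined) are my own choices for illustration.

```python
import math

def central_difference(f, x, h=1e-6):
    """Numerical slope estimate: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def f1(x):
    return math.sin(math.sqrt(x**2 + 2 * x - 7))

def df1(x):
    inner = x**2 + 2 * x - 7
    return 0.5 * inner**-0.5 * (2 * x + 2) * math.cos(math.sqrt(inner))

def f2(x):
    return math.sqrt(math.sin(x) * math.cos(x))

def df2(x):
    s, c = math.sin(x), math.cos(x)
    return 0.5 * (s * c) ** -0.5 * (c**2 - s**2)

print(df1(3.0), central_difference(f1, 3.0))
print(df2(0.5), central_difference(f2, 0.5))
```

Each pair of printed numbers should agree to several decimal places. If they did not, that would be a sign that a rule was applied in the wrong order or a factor was dropped.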

Next time, I will show some examples where the derivatives are used. In the meantime, you can use the results of derivatives found in this post to find the derivative of $f( x) =\sqrt{\text{sin}( x)\text{cos}( x)}\text{sin}\left(\sqrt{x^{2} +2x-7}\right)$

## The Derivative, Part 7

So let’s recap: we have a rule to find derivatives of basic functions using a table, a rule to handle a function that is multiplied by a constant, a rule to handle the addition or subtraction of two (or more) functions, a rule to handle the multiplication of two (or more) functions, and a rule to handle the division of two functions. I also did an example where several of these rules can be used finding the derivative of a single function. You would think that this would exhaust all the possibilities and that you can now differentiate any function in the known universe. But alas, there is one more, perhaps the most powerful, rule yet to be presented.

This new rule is called the chain rule, so called because it allows you to find the derivative of a function, of a function, of a function, and so on.

Now there is a textbook way to present this rule and an intuitive way which I like to use. I find that the textbook approach can be confusing because there are several variables to keep track of. I will present both ways so that you may see the connection between the two and have a better understanding of the chain rule.

The textbook approach to the chain rule is a bit easier to see if we forego functional notation and go back to using y. However, whenever you have a function of a function f[g(x)], the chain rule is to be used. Functions like this are called composite functions. For example,

$f( x) =\text{sin}\left( x^{2}\right)$

Here, g(x) = x² and f(x) = sin(x). So f(x²) = sin(x²).

In the textbook approach you let u be the inner function (that is the function you are using as the argument for the outer function) and you let y be the function after you replace the inner function with u. I will give an explanation later on how to identify the inner and outer functions if that is not clear.

So in this case, u = x², so y = sin(u). The textbook chain rule is

$\frac{dy}{dx} =\frac{dy}{du} \times \frac{du}{dx}$

This may look scary but let me repeat this rule in English: the derivative of a composite function is the derivative of y with respect to u times the derivative of u with respect to x. So in our example, dy/du = cos(u) (using the table) and du/dx = 2x. Multiplying these together and replacing u with its definition, we get

$\frac{dy}{dx} =\text{cos}( u)\times 2x\ =\ 2x\ \text{cos}( x^{2})$

So to further explain what inner and outer functions are, suppose you wanted to take our example function and calculate its value for a certain number for x. The first thing you would do is take that number and square it. The squaring function is the inner function since it is the first thing you would do. Then you would take the sine of that squared number. The sine function then is the outer function as that is the last operation you would do.

So I explain the chain rule as follows: Take the derivative of the outer function of ‘something’ keeping the ‘something’ intact, but since the ‘something’ is not just ‘x‘ you need to multiply the result by the derivative of that ‘something’.

In this example, the ‘something’ is x². So the derivative of the sine of that ‘something’ is cos(x²), but I then need to multiply this by the derivative of the ‘something’. The derivative of x² is 2x, so the result is 2x cos(x²).

So now let’s reverse the roles of the inner and outer functions. Consider the derivative of [sin(x)]². A very common shortcut notation for the square of a trig function like this is [sin(x)]² = sin²(x). Again, imagine actually calculating this for a particular value of x. You would first take the sine of that number (the inner function) then square the result (the outer function). We know that the derivative of the square of ‘something’ is 2 times that ‘something’ to the first power which in this case is 2 sin(x). But to compensate for this simplification, we need to multiply the result by the derivative of that ‘something’. In this case, the derivative of sin(x) is cos(x), so the final answer is 2 sin(x)cos(x).
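The “derivative of the outer function times the derivative of the something” recipe can be written almost word for word in Python. This is only a sketch. The helper name `chain_rule` is hypothetical, and you have to supply the derivatives of the inner and outer functions yourself.

```python
import math

def chain_rule(d_outer, inner, d_inner):
    """Return the derivative of outer(inner(x)) as d_outer(inner(x)) * d_inner(x).

    This mirrors dy/dx = dy/du * du/dx with u = inner(x).
    """
    return lambda x: d_outer(inner(x)) * d_inner(x)

# sin(x^2): outer is sine (derivative cos), inner is x^2 (derivative 2x)
d_sin_of_square = chain_rule(math.cos, lambda x: x**2, lambda x: 2 * x)

# [sin(x)]^2: outer is squaring (derivative 2u), inner is sine (derivative cos)
d_square_of_sin = chain_rule(lambda u: 2 * u, math.sin, math.cos)

x = 1.3
print(d_sin_of_square(x), 2 * x * math.cos(x**2))
print(d_square_of_sin(x), 2 * math.sin(x) * math.cos(x))
```

Swapping which function is passed as the outer and which as the inner is all it takes to go from the first example to the second.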

Now to get comfortable with this, we need to do some more examples. I will do that in my next post.

## The Derivative, Part 6

Last time I presented the multiplication rule of differentiation to be used when given a function that is the multiplication of two or more other functions. As you would guess, there is also a rule that handles the division of two functions.

Let’s say you have the function

$f( x) =\frac{x^{2}}{\text{sin}( x)}$

This one can be solved with the multiplication rule if you remember that 1/sin(x) = csc(x). But as I haven’t told you what the derivative of csc(x) is, we are stuck using the following division rule. But this highlights the point that as we get deeper into maths, there are often several ways to solve a problem. The maths “arteest” is one that solves a problem elegantly.

So the following rule is the division rule. Again, I will use u(x) and v(x) to split the function up into its parts. If you have a function of the form

$f( x) =\frac{u( x)}{v( x)}$

then the derivative of f(x) is

$f'( x) =\frac{u'( x) v( x) -u( x) v'( x)}{[ v( x)]^{2}}$

As you can see, this rule is a bit more complex which is why you would use a simpler rule if possible. But it is still relatively easy to use if you keep track of which part is u and which part is v. Unlike the multiplication rule, the order of the two terms in the numerator matters here because of the minus sign.

Using the example function above,

$\begin{array}{{>{\displaystyle}l}} u( x) =x^{2} ,\ \ \ \ \ v( x) =\text{sin}( x)\\ u'( x) =2x,\ \ \ \ v'( x) =\text{cos}( x) \end{array}$

So according to the division rule,

$f'( x) =\frac{2x\text{sin}( x) -x^{2}\text{cos}( x)}{\text{sin}^{2}( x)}$

Now you can use many rules in a single differentiation problem. Consider

$f( x) =\frac{x^{2} e^{x}}{\text{sin}( x)}$

Here, the numerator is a multiplication of two functions. So when using the division rule, you need to apply the multiplication rule for the u‘ part:

$\begin{array}{{>{\displaystyle}l}} u( x) \ =\ x^{2} e^{x} \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ v( x) =\text{sin}( x)\\ u'( x) =x^{2} e^{x} +2xe^{x} \ \ \ \ \ v'( x) =\text{cos}( x) \end{array}$

I’ll leave it as an exercise for you to see if I correctly found u‘(x) using the multiplication rule. I used the fact that $e^{x}$ (as seen from the table I provided a couple of posts before) is its own derivative. Anyway, using the division rule,

$f'( x) =\frac{\left( x^{2} e^{x} +2xe^{x}\right)\text{sin}( x) -x^{2} e^{x}\text{cos}( x)}{\text{sin}^{2}( x)}$
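Because the sign in the numerator is easy to get backwards, it is worth checking the division rule numerically. The sketch below uses the standard form (u′v − uv′)/v² and compares it with a central-difference slope estimate for both examples. The helper names and the test point x = 1 are my own choices.

```python
import math

def quotient_rule(u, du, v, dv):
    """Derivative of u(x)/v(x): (u'(x) v(x) - u(x) v'(x)) / v(x)^2."""
    return lambda x: (du(x) * v(x) - u(x) * dv(x)) / v(x) ** 2

def central_difference(f, x, h=1e-6):
    """Numerical slope estimate: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# First example: x^2 / sin(x)
fa = lambda x: x**2 / math.sin(x)
dfa = quotient_rule(lambda x: x**2, lambda x: 2 * x, math.sin, math.cos)

# Second example: x^2 e^x / sin(x), with u' found from the multiplication rule
fb = lambda x: x**2 * math.exp(x) / math.sin(x)
dfb = quotient_rule(
    lambda x: x**2 * math.exp(x),
    lambda x: x**2 * math.exp(x) + 2 * x * math.exp(x),
    math.sin,
    math.cos,
)

print(dfa(1.0), central_difference(fa, 1.0))
print(dfb(1.0), central_difference(fb, 1.0))
```

If the two terms in the numerator were swapped, the rule would give the negative of the true slope, and the mismatch with the numerical estimate would show up immediately.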

So you might be thinking that you can differentiate any function as long as you know the derivatives of the individual parts. So how would you differentiate

$f( x) =\text{sin}\left( x^{2}\right) ?$

This is not a multiplication of functions, but rather a function of a function. I will introduce the very powerful chain rule as it applies to differentiation in my next post.

## The Derivative, Part 5

So last time, I provided a table of derivatives given a function that is of a particular form. Because of rules 3 and 4 (you will need to see my last post to see what these are), along with the other entries in the table, you can now differentiate many functions not specifically in the table. But there are still many functions that you cannot differentiate without other rules. For example, if

$f( x) =x^{2}\text{sin}( x)$

there is no table entry to help you. Even though you can differentiate x² and sin(x) separately, there is no rule in the table that allows you to differentiate their multiplication together since they are both functions of x, that is, neither one is just a constant. You can’t use rule 4 here.

There is a differentiation rule that handles this. It is the multiplication rule and it states that if you have a function of the form

$f( x) =u( x) v( x)$

then the derivative is

$f'( x) =u( x) v'( x) +u'( x) v( x)$

This can be proven using the basic definition of a derivative, but you can just take my word for it.

So in the example at the beginning of this post,

$\begin{array}{{>{\displaystyle}l}} u( x) =x^{2} ,\ \ \ \ \ v( x) =\text{sin}( x)\\ u'( x) =2x,\ \ \ \ v'( x) =\text{cos}( x) \end{array}$

where I used the table in my last post to find the individual derivatives. So according to the rule,

$f'( x) =x^{2}\text{cos}( x) +2x\text{sin}( x)$

Now this rule can be extended to handle more than two functions multiplied together. If

$f( x) =u( x) v( x) w( x)$

then you can use the original rule twice, or

$f'( x) =u( x) v( x) w'( x) +u( x) v'( x) w( x) +u'( x) v( x) w( x)$

I think you can see the pattern here. So if

$\begin{array}{{>{\displaystyle}l}} f( x) =x^{2}\text{sin}( x)\text{cos}( x)\\ u( x) =x^{2} ,\ \ v( x) =\text{sin}( x) ,\ \ \ w( x) =\text{cos}( x)\\ u'( x) =2x,\ \ v'( x) =\text{cos}( x) ,\ \ w'( x) =-\text{sin}( x) \end{array}$

So the derivative is

$f'( x) =-x^{2}\text{sin}^{2}( x) +x^{2}\text{cos}^{2}( x) +2x\text{sin}( x)\text{cos}( x)$

Now this can be simplified using trig identities but I will leave it here.
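Here is a short Python sketch of the three-function rule applied to this example. The helper name is mine. For comparison I have also included the simplified form x²cos(2x) + x sin(2x), which comes from the double-angle identities cos²(x) − sin²(x) = cos(2x) and 2 sin(x)cos(x) = sin(2x).

```python
import math

def product_rule3(u, du, v, dv, w, dw):
    """Derivative of u(x) v(x) w(x): u v w' + u v' w + u' v w."""
    return lambda x: (
        u(x) * v(x) * dw(x) + u(x) * dv(x) * w(x) + du(x) * v(x) * w(x)
    )

df = product_rule3(
    lambda x: x**2, lambda x: 2 * x,        # u and u'
    math.sin, math.cos,                     # v and v'
    math.cos, lambda x: -math.sin(x),       # w and w'
)

def df_simplified(x):
    """Same derivative after applying the double-angle identities."""
    return x**2 * math.cos(2 * x) + x * math.sin(2 * x)

for x in (0.3, 1.0, 2.2):
    print(x, df(x), df_simplified(x))
```

The two columns agree, which is a useful check that no term was dropped when applying the rule.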

What about a function that’s a division of two functions? Yes, there is a rule for that as well, but I’ll cover that in my next post.

## The Derivative, Part 4

Last time I provided some general rules for finding the derivatives for functions of different forms. Let me summarise these and provide some new ones as well. The new ones can be developed using the basic definition of the derivative. The letters a and n are constants and are not a function variable:

These rules can be used for more than the explicit function forms included, especially using rules 3 and 4. For example if

$f( x) =3x^{3} -2x^{2} +5x-7$

then by using rules 2, 3, and 4 you can find the derivative as

$f'( x) =9x^{2} -4x +5$

Now let’s look at a more complex function:

$f( x) =3\text{sin}( 2x) -2\text{cos}( 3x) -0.25x^{2}$

where x is in radians. We can use rules 2, 3, 4, 6, and 7 and take the derivative of each term to get

$f'( x) =6\text{cos}( 2x) +6\text{sin}( 3x) -0.5x$

Now let’s look at a common use for derivatives. It is often needed to find the maximum and minimum of a function. Let’s look at the function f(x) = 3x³-10x²+9x:

We would like to know where (the x value) the peak (local maximum) and the local minimum occur and what the values of the function are at those points. As you have seen before, the gradient of the tangent line at each of these points is zero. Since the derivative of a function gives us the gradient, we can find the derivative and find the values of x that make it zero. Using our rules for derivatives, f‘(x) = 9x²-20x+9. So we want to find the solutions to

f‘(x) = 9x²-20x+9 = 0

Using the quadratic formula, the two solutions are x = 0.627 and 1.595. We can evaluate the original function at these values of x to get the two points (0.627, 2.451) as the local maximum and (1.595, 1.088) as the local minimum.
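These numbers are easy to reproduce. The following Python sketch (function names are my own) applies the quadratic formula to f‘(x) = 9x² − 20x + 9 and then evaluates the original function at each solution.

```python
import math

def f(x):
    """The original function f(x) = 3x^3 - 10x^2 + 9x."""
    return 3 * x**3 - 10 * x**2 + 9 * x

def quadratic_roots(a, b, c):
    """Real solutions of a*x^2 + b*x + c = 0, smallest first (empty if none)."""
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

# Points where the gradient f'(x) = 9x^2 - 20x + 9 is zero
for x in quadratic_roots(9, -20, 9):
    print(round(x, 3), round(f(x), 3))
```

The function value at the smaller x is the larger of the two, which matches the peak followed by the dip described above.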

A practical use of this is to find the maximum height a ball achieves that is thrown up into the air. Using physics to come up with the equations of motion of the ball, one can find the answer.

Even though I have shown that we can now differentiate a plethora of functions, there are still some functional forms that we cannot differentiate using the rules presented so far. I will cover some new rules in my next post.

## The Derivative, Part 3

Now that we have some confidence that the derivative definition gives correct results of functions that we know the answer to, let’s look at a functional form where the answer is not known.

Consider f(x) = x². As you know, this function plots as the standard parabola. The slope of a tangent line on this curve (its rate of change) is not constant, unlike the cases we have looked at before, but it depends on where we are on the curve:

$f'( x) =\lim _{h\rightarrow 0}\frac{f( x+h) -f( x)}{h} = \lim _{h\rightarrow 0}\frac{( x+h)^{2} -x^{2}}{h} =$ $\lim _{h\rightarrow 0}\frac{x^{2} +2xh+h^{2} -x^{2}}{h} =\lim _{h\rightarrow 0}\frac{h( 2x+h)}{h} =\lim _{h\rightarrow 0}( 2x+h) =2x$

So again, we do some algebraic manipulation that gets rid of the h in the denominator. Remember, as we are taking the limit as h approaches 0, the x is essentially treated as a constant. So the final answer is f‘(x) = 2x. Referring back to the graph, this satisfies the tangent line slopes at -1 and 1: f‘(-1) = -2, f‘(1) = 2. At any other point on the graph, just evaluate f‘(x) = 2x to find the rate of change of f(x) = x² at a particular x.
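You can watch this limit happen numerically. The sketch below (helper names are mine) evaluates the fraction from the definition for smaller and smaller h at x = 1 and x = −1.

```python
def difference_quotient(f, x, h):
    """The fraction inside the limit: (f(x + h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h

def square(x):
    return x * x

for h in (0.1, 0.01, 0.001):
    at_plus_one = difference_quotient(square, 1.0, h)
    at_minus_one = difference_quotient(square, -1.0, h)
    print(h, at_plus_one, at_minus_one)
```

From the algebra above, the fraction equals 2x + h exactly, so the printed values sit a distance h away from 2 and −2 and close in on them as h shrinks.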

Now do you have to evaluate the definition for every different function you come across? Thankfully, the answer is no. Mathematicians long ago did the hard work for you and, because of the properties of limits, many general rules can be made. For example, if you know the derivative of a function, but what you have is the same function but multiplied by a constant, the derivative of this new function is just the same constant times the derivative of the old function. For example, we now know that for f(x) = x², f‘(x) = 2x. But what about g(x) = 3x²? Well, g‘(x) will just be 3 times the derivative of x², so g‘(x) = 6x.

So the rule is, if g(x) = af(x) where a is a constant number, then g‘(x) = af‘(x). Another generic rule is that the derivative of a sum of functions is the sum of the individual derivatives: If h(x) = f(x) + g(x), then h‘(x) = f‘(x) + g‘(x).

It turns out that if

$f( x) =ax^{n}$

where n is any real number, then

$f'( x) =anx^{n-1}$

So to find the derivative in this case, you just multiply the function by n and reduce the value of the exponent by 1.
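As a small illustration, the rule can be coded in one line and checked against a numerical slope estimate. The helper names and the sample values of n are my own. I have deliberately included n = −1 and a fractional n, since the rule works for those powers as well.

```python
def power_rule(a, n):
    """For f(x) = a*x**n, return (coefficient, exponent) of f'(x) = a*n*x**(n-1)."""
    return a * n, n - 1

def evaluate(coefficient, exponent, x):
    return coefficient * x**exponent

def central_difference(f, x, h=1e-6):
    """Numerical slope estimate: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 2.0
for a, n in ((3, 2), (1, -1), (4, 0.5)):
    from_rule = evaluate(*power_rule(a, n), x)
    numeric = central_difference(lambda t: a * t**n, x)
    print(a, n, from_rule, numeric)
```

For g(x) = 3x² this returns a coefficient of 6 and an exponent of 1, that is g‘(x) = 6x, in agreement with the earlier example.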

Next time, I will present a table of common derivatives and do some sample problems.

## The Derivative, Part 2

I ended my last post with the rather daunting definition of the derivative:

$f'( x) \ =\lim _{h\rightarrow 0} \ \frac{f( x+h) -f( x)}{h}$

I will now show how this definition can be used to find much simpler ways to calculate a derivative.

Let’s start with an example that we already know the answer to and is the simplest function we can think of, f(x) = c where c is some constant. You know that if the function does not change anywhere over the values of x, its rate of change (derivative) is zero. You see this if you plotted the function – it’s a horizontal line and a horizontal line has a gradient of zero. So f‘(x) = 0. Let’s see if the derivative definition gives us the same answer.

$f'( x) \ =\lim _{h\rightarrow 0} \ \frac{f( x+h) -f( x)}{h} =\lim _{h\rightarrow 0} \ \frac{c-c}{h} \rightarrow \frac{0}{0}$

Well that didn’t help much – we just got an indeterminate form 0/0. But as I said in my last post, there will always be some algebraic manipulation required to remove the problem.

One common method is to multiply the numerator and denominator by the same fraction. This does not change the value of the expression but if you use the right fraction, it removes the issue. In this case, multiply top and bottom by 1/h.

$f'( x) \ =\lim _{h\rightarrow 0} \ \frac{\frac{1}{h}( c-c)}{\frac{1}{h}( h)} =\lim _{h\rightarrow 0} \ \frac{\frac{1}{h}( 0)}{1} =\lim _{h\rightarrow 0} \ \frac{0}{1} =0$

Notice that by doing that, we got rid of h and we are left with 0/1 which is definitely 0. So the first rule of finding a derivative: if f(x) = c, then f‘(x) = 0.

Now let’s look at a more complex function, but again, one you know the answer to. The generic equation of a line is f(x) = mx + c where m and c are specific numbers: f(x) = 3x + 7 is an example. Again, from your studies of linear equations, you know this kind of function will plot as a straight line with a gradient of m. So we know that if f(x) = mx + c, then f‘(x) = m. Does our definition give the same result?

$f'( x) \ =\lim _{h\rightarrow 0} \ \frac{m( x+h) +c-( mx+c)}{h} =\lim _{h\rightarrow 0} \ \frac{mx+mh+c-mx-c}{h} =\lim _{h\rightarrow 0} \ \frac{mh}{h} =\lim _{h\rightarrow 0} \ m=m$

So when we enter in the particular function into the definition, then expand it, get rid of the terms that cancel, then cancel the common factor h, we again get rid of the dependency on h. We are left with the limit of a constant m as h approaches 0. But as m does not care what h does, the answer is just m – just what we expected. So we now know if f(x) = mx + c, then f‘(x) = m.

Next time, I will do the same thing but use functions for which we don’t know the answer.